mirror of
https://github.com/bytedance/deer-flow.git
synced 2026-06-10 01:15:58 +00:00
Compare commits
10 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| d7a2fff7e0 | |||
| eabd78ce4e | |||
| 533d3fbfee | |||
| d6b3a277a5 | |||
| def2a3ad79 | |||
| 3c0b42d836 | |||
| 34ec205e1d | |||
| 11a9041b65 | |||
| d3066a1746 | |||
| 485f8a2bf2 |
@@ -59,7 +59,7 @@ smoke-test/
|
|||||||
2. **Check pnpm** - Package manager
|
2. **Check pnpm** - Package manager
|
||||||
3. **Check uv** - Python package manager
|
3. **Check uv** - Python package manager
|
||||||
4. **Check nginx** - Reverse proxy
|
4. **Check nginx** - Reverse proxy
|
||||||
5. **Check required ports** - Confirm that ports 2026, 3000, and 8001 are not occupied
|
5. **Check required ports** - Confirm that ports 2026, 3000, 8001, and 2024 are not occupied
|
||||||
|
|
||||||
**Docker mode environment check** (if Docker is selected):
|
**Docker mode environment check** (if Docker is selected):
|
||||||
1. **Check whether Docker is installed** - Run `docker --version`
|
1. **Check whether Docker is installed** - Run `docker --version`
|
||||||
@@ -93,17 +93,17 @@ smoke-test/
|
|||||||
### Phase 5: Service Health Check
|
### Phase 5: Service Health Check
|
||||||
|
|
||||||
**Local mode health check**:
|
**Local mode health check**:
|
||||||
1. **Check process status** - Confirm that Gateway, Frontend, and Nginx processes are all running
|
1. **Check process status** - Confirm that LangGraph, Gateway, Frontend, and Nginx processes are all running
|
||||||
2. **Check frontend service** - Visit `http://localhost:2026` and verify that the page loads
|
2. **Check frontend service** - Visit `http://localhost:2026` and verify that the page loads
|
||||||
3. **Check API Gateway** - Verify the `http://localhost:2026/health` endpoint
|
3. **Check API Gateway** - Verify the `http://localhost:2026/health` endpoint
|
||||||
4. **Check LangGraph-compatible API** - Verify the `/api/langgraph/*` route exposed by Gateway
|
4. **Check LangGraph service** - Verify the availability of relevant endpoints
|
||||||
5. **Frontend route smoke check** - Run `bash .agent/skills/smoke-test/scripts/frontend_check.sh` to verify key routes under `/workspace`
|
5. **Frontend route smoke check** - Run `bash .agent/skills/smoke-test/scripts/frontend_check.sh` to verify key routes under `/workspace`
|
||||||
|
|
||||||
**Docker mode health check** (when using Docker):
|
**Docker mode health check** (when using Docker):
|
||||||
1. **Check container status** - Run `docker ps` and confirm that all containers are running
|
1. **Check container status** - Run `docker ps` and confirm that all containers are running
|
||||||
2. **Check frontend service** - Visit `http://localhost:2026` and verify that the page loads
|
2. **Check frontend service** - Visit `http://localhost:2026` and verify that the page loads
|
||||||
3. **Check API Gateway** - Verify the `http://localhost:2026/health` endpoint
|
3. **Check API Gateway** - Verify the `http://localhost:2026/health` endpoint
|
||||||
4. **Check LangGraph-compatible API** - Verify the `/api/langgraph/*` route exposed by Gateway
|
4. **Check LangGraph service** - Verify the availability of relevant endpoints
|
||||||
5. **Frontend route smoke check** - Run `bash .agent/skills/smoke-test/scripts/frontend_check.sh` to verify key routes under `/workspace`
|
5. **Frontend route smoke check** - Run `bash .agent/skills/smoke-test/scripts/frontend_check.sh` to verify key routes under `/workspace`
|
||||||
|
|
||||||
### Optional Functional Verification
|
### Optional Functional Verification
|
||||||
@@ -135,7 +135,7 @@ smoke-test/
|
|||||||
|
|
||||||
The following warnings can appear during smoke testing and do not block a successful result:
|
The following warnings can appear during smoke testing and do not block a successful result:
|
||||||
- Feishu/Lark SSL errors in Gateway logs (certificate verification failure) can be ignored if that channel is not enabled
|
- Feishu/Lark SSL errors in Gateway logs (certificate verification failure) can be ignored if that channel is not enabled
|
||||||
- Warnings in Gateway logs about missing methods in the custom checkpointer, such as `adelete_for_runs` or `aprune`, do not affect the core functionality
|
- Warnings in LangGraph logs about missing methods in the custom checkpointer, such as `adelete_for_runs` or `aprune`, do not affect the core functionality
|
||||||
|
|
||||||
## Key Tools
|
## Key Tools
|
||||||
|
|
||||||
|
|||||||
@@ -138,6 +138,7 @@ This document describes the detailed operating steps for each phase of the DeerF
|
|||||||
lsof -i :2026 # Main port
|
lsof -i :2026 # Main port
|
||||||
lsof -i :3000 # Frontend
|
lsof -i :3000 # Frontend
|
||||||
lsof -i :8001 # Gateway
|
lsof -i :8001 # Gateway
|
||||||
|
lsof -i :2024 # LangGraph
|
||||||
```
|
```
|
||||||
|
|
||||||
**Success Criteria**: All ports are free, or they are occupied only by DeerFlow-related processes.
|
**Success Criteria**: All ports are free, or they are occupied only by DeerFlow-related processes.
|
||||||
@@ -257,7 +258,7 @@ This document describes the detailed operating steps for each phase of the DeerF
|
|||||||
**Steps**:
|
**Steps**:
|
||||||
1. Run `make dev-daemon` (background mode)
|
1. Run `make dev-daemon` (background mode)
|
||||||
|
|
||||||
**Description**: This command starts all services (Gateway embedded runtime, Frontend, Nginx).
|
**Description**: This command starts all services (LangGraph, Gateway, Frontend, Nginx).
|
||||||
|
|
||||||
**Notes**:
|
**Notes**:
|
||||||
- `make dev` runs in the foreground and stops with Ctrl+C
|
- `make dev` runs in the foreground and stops with Ctrl+C
|
||||||
@@ -271,6 +272,7 @@ This document describes the detailed operating steps for each phase of the DeerF
|
|||||||
**Steps**:
|
**Steps**:
|
||||||
1. Wait 90-120 seconds for all services to start completely
|
1. Wait 90-120 seconds for all services to start completely
|
||||||
2. You can monitor startup progress by checking these log files:
|
2. You can monitor startup progress by checking these log files:
|
||||||
|
- `logs/langgraph.log`
|
||||||
- `logs/gateway.log`
|
- `logs/gateway.log`
|
||||||
- `logs/frontend.log`
|
- `logs/frontend.log`
|
||||||
- `logs/nginx.log`
|
- `logs/nginx.log`
|
||||||
@@ -314,10 +316,11 @@ This document describes the detailed operating steps for each phase of the DeerF
|
|||||||
**Steps**:
|
**Steps**:
|
||||||
1. Run the following command to check processes:
|
1. Run the following command to check processes:
|
||||||
```bash
|
```bash
|
||||||
ps aux | grep -E "(uvicorn|next|nginx)" | grep -v grep
|
ps aux | grep -E "(langgraph|uvicorn|next|nginx)" | grep -v grep
|
||||||
```
|
```
|
||||||
|
|
||||||
**Success Criteria**: Confirm that the following processes are running:
|
**Success Criteria**: Confirm that the following processes are running:
|
||||||
|
- LangGraph (`langgraph dev`)
|
||||||
- Gateway (`uvicorn app.gateway.app:app`)
|
- Gateway (`uvicorn app.gateway.app:app`)
|
||||||
- Frontend (`next dev` or `next start`)
|
- Frontend (`next dev` or `next start`)
|
||||||
- Nginx (`nginx`)
|
- Nginx (`nginx`)
|
||||||
@@ -353,11 +356,10 @@ curl http://localhost:2026/health
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
#### 5.1.4 Check LangGraph-compatible API
|
#### 5.1.4 Check LangGraph Service
|
||||||
|
|
||||||
**Steps**:
|
**Steps**:
|
||||||
1. Visit `http://localhost:2026/api/langgraph/assistants/lead_agent` to verify Gateway's LangGraph-compatible API route is reachable.
|
1. Visit relevant LangGraph endpoints to verify availability
|
||||||
2. A `401` response is acceptable when authentication is enabled and no session cookie is provided.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -371,6 +373,7 @@ curl http://localhost:2026/health
|
|||||||
- `deer-flow-nginx`
|
- `deer-flow-nginx`
|
||||||
- `deer-flow-frontend`
|
- `deer-flow-frontend`
|
||||||
- `deer-flow-gateway`
|
- `deer-flow-gateway`
|
||||||
|
- `deer-flow-langgraph` (if not in gateway mode)
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -403,11 +406,10 @@ curl http://localhost:2026/health
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
#### 5.2.4 Check LangGraph-compatible API
|
#### 5.2.4 Check LangGraph Service
|
||||||
|
|
||||||
**Steps**:
|
**Steps**:
|
||||||
1. Visit `http://localhost:2026/api/langgraph/assistants/lead_agent` to verify Gateway's LangGraph-compatible API route is reachable.
|
1. Visit relevant LangGraph endpoints to verify availability
|
||||||
2. A `401` response is acceptable when authentication is enabled and no session cookie is provided.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|||||||
@@ -254,6 +254,7 @@ Processes exit quickly after running `make dev-daemon`.
|
|||||||
**Solutions**:
|
**Solutions**:
|
||||||
1. Check log files:
|
1. Check log files:
|
||||||
```bash
|
```bash
|
||||||
|
tail -f logs/langgraph.log
|
||||||
tail -f logs/gateway.log
|
tail -f logs/gateway.log
|
||||||
tail -f logs/frontend.log
|
tail -f logs/frontend.log
|
||||||
tail -f logs/nginx.log
|
tail -f logs/nginx.log
|
||||||
@@ -366,7 +367,24 @@ Errors appear in `gateway.log`.
|
|||||||
uv sync
|
uv sync
|
||||||
```
|
```
|
||||||
|
|
||||||
4. Confirm that the Gateway process is running normally.
|
4. Confirm that the LangGraph service is running normally (if not in gateway mode)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Issue: LangGraph Fails to Start
|
||||||
|
|
||||||
|
**Symptoms**:
|
||||||
|
Errors appear in `langgraph.log`.
|
||||||
|
|
||||||
|
**Solutions**:
|
||||||
|
1. Check LangGraph logs:
|
||||||
|
```bash
|
||||||
|
tail -f logs/langgraph.log
|
||||||
|
```
|
||||||
|
|
||||||
|
2. Check config.yaml
|
||||||
|
3. Check whether Python dependencies are complete
|
||||||
|
4. Confirm that port 2024 is not occupied
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -501,7 +519,7 @@ Accessing `/health` returns an error or times out.
|
|||||||
|
|
||||||
2. Confirm that config.yaml exists and has valid formatting
|
2. Confirm that config.yaml exists and has valid formatting
|
||||||
3. Check whether Python dependencies are complete
|
3. Check whether Python dependencies are complete
|
||||||
4. Confirm that the Gateway process is running normally.
|
4. Confirm that the LangGraph service is running normally
|
||||||
|
|
||||||
**Solutions** (Docker mode):
|
**Solutions** (Docker mode):
|
||||||
1. Check gateway container logs:
|
1. Check gateway container logs:
|
||||||
@@ -511,7 +529,7 @@ Accessing `/health` returns an error or times out.
|
|||||||
|
|
||||||
2. Confirm that config.yaml is mounted correctly
|
2. Confirm that config.yaml is mounted correctly
|
||||||
3. Check whether Python dependencies are complete
|
3. Check whether Python dependencies are complete
|
||||||
4. Confirm that the Gateway process is running normally.
|
4. Confirm that the LangGraph service is running normally
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -521,7 +539,7 @@ Accessing `/health` returns an error or times out.
|
|||||||
|
|
||||||
#### View All Service Processes
|
#### View All Service Processes
|
||||||
```bash
|
```bash
|
||||||
ps aux | grep -E "(uvicorn|next|nginx)" | grep -v grep
|
ps aux | grep -E "(langgraph|uvicorn|next|nginx)" | grep -v grep
|
||||||
```
|
```
|
||||||
|
|
||||||
#### View Service Logs
|
#### View Service Logs
|
||||||
@@ -530,6 +548,7 @@ ps aux | grep -E "(uvicorn|next|nginx)" | grep -v grep
|
|||||||
tail -f logs/*.log
|
tail -f logs/*.log
|
||||||
|
|
||||||
# View specific service logs
|
# View specific service logs
|
||||||
|
tail -f logs/langgraph.log
|
||||||
tail -f logs/gateway.log
|
tail -f logs/gateway.log
|
||||||
tail -f logs/frontend.log
|
tail -f logs/frontend.log
|
||||||
tail -f logs/nginx.log
|
tail -f logs/nginx.log
|
||||||
|
|||||||
@@ -65,7 +65,7 @@ if ! command -v lsof >/dev/null 2>&1; then
|
|||||||
echo " Install lsof and rerun this check"
|
echo " Install lsof and rerun this check"
|
||||||
all_passed=false
|
all_passed=false
|
||||||
else
|
else
|
||||||
for port in 2026 3000 8001; do
|
for port in 2026 3000 8001 2024; do
|
||||||
if lsof -i :$port >/dev/null 2>&1; then
|
if lsof -i :$port >/dev/null 2>&1; then
|
||||||
echo "⚠ Port $port is already in use:"
|
echo "⚠ Port $port is already in use:"
|
||||||
lsof -i :$port | head -2
|
lsof -i :$port | head -2
|
||||||
|
|||||||
@@ -54,6 +54,7 @@ echo "=========================================="
|
|||||||
echo ""
|
echo ""
|
||||||
echo "🌐 Access URL: http://localhost:2026"
|
echo "🌐 Access URL: http://localhost:2026"
|
||||||
echo "📋 View logs:"
|
echo "📋 View logs:"
|
||||||
|
echo " - logs/langgraph.log"
|
||||||
echo " - logs/gateway.log"
|
echo " - logs/gateway.log"
|
||||||
echo " - logs/frontend.log"
|
echo " - logs/frontend.log"
|
||||||
echo " - logs/nginx.log"
|
echo " - logs/nginx.log"
|
||||||
|
|||||||
@@ -76,11 +76,12 @@ if [ "$mode" = "docker" ]; then
|
|||||||
all_passed=false
|
all_passed=false
|
||||||
fi
|
fi
|
||||||
else
|
else
|
||||||
summary_hint="logs/{gateway,frontend,nginx}.log"
|
summary_hint="logs/{langgraph,gateway,frontend,nginx}.log"
|
||||||
print_step "1. Checking local service ports..."
|
print_step "1. Checking local service ports..."
|
||||||
check_listen_port "Nginx" 2026
|
check_listen_port "Nginx" 2026
|
||||||
check_listen_port "Frontend" 3000
|
check_listen_port "Frontend" 3000
|
||||||
check_listen_port "Gateway" 8001
|
check_listen_port "Gateway" 8001
|
||||||
|
check_listen_port "LangGraph" 2024
|
||||||
fi
|
fi
|
||||||
echo ""
|
echo ""
|
||||||
|
|
||||||
@@ -103,8 +104,8 @@ else
|
|||||||
fi
|
fi
|
||||||
echo ""
|
echo ""
|
||||||
|
|
||||||
echo "5. Checking LangGraph-compatible Gateway API..."
|
echo "5. Checking LangGraph service..."
|
||||||
check_http_status "LangGraph-compatible Gateway API" "http://localhost:2026/api/langgraph/assistants/lead_agent" "200|401"
|
check_http_status "LangGraph service" "http://localhost:2024/" "200|301|302|307|308|404"
|
||||||
echo ""
|
echo ""
|
||||||
|
|
||||||
echo "=========================================="
|
echo "=========================================="
|
||||||
|
|||||||
@@ -78,7 +78,7 @@
|
|||||||
- [x] Container status - {{status_containers}}
|
- [x] Container status - {{status_containers}}
|
||||||
- [x] Frontend service - {{status_frontend}}
|
- [x] Frontend service - {{status_frontend}}
|
||||||
- [x] API Gateway - {{status_api_gateway}}
|
- [x] API Gateway - {{status_api_gateway}}
|
||||||
- [x] LangGraph-compatible Gateway API - {{status_langgraph}}
|
- [x] LangGraph service - {{status_langgraph}}
|
||||||
|
|
||||||
**Phase Status**: {{stage5_status}}
|
**Phase Status**: {{stage5_status}}
|
||||||
|
|
||||||
@@ -147,6 +147,7 @@ Commit Message: {{git_commit_message}}
|
|||||||
| deer-flow-nginx | {{nginx_status}} | {{nginx_uptime}} |
|
| deer-flow-nginx | {{nginx_status}} | {{nginx_uptime}} |
|
||||||
| deer-flow-frontend | {{frontend_status}} | {{frontend_uptime}} |
|
| deer-flow-frontend | {{frontend_status}} | {{frontend_uptime}} |
|
||||||
| deer-flow-gateway | {{gateway_status}} | {{gateway_uptime}} |
|
| deer-flow-gateway | {{gateway_status}} | {{gateway_uptime}} |
|
||||||
|
| deer-flow-langgraph | {{langgraph_status}} | {{langgraph_uptime}} |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|||||||
@@ -80,7 +80,7 @@
|
|||||||
- [x] Process status - {{status_processes}}
|
- [x] Process status - {{status_processes}}
|
||||||
- [x] Frontend service - {{status_frontend}}
|
- [x] Frontend service - {{status_frontend}}
|
||||||
- [x] API Gateway - {{status_api_gateway}}
|
- [x] API Gateway - {{status_api_gateway}}
|
||||||
- [x] LangGraph-compatible Gateway API - {{status_langgraph}}
|
- [x] LangGraph service - {{status_langgraph}}
|
||||||
|
|
||||||
**Phase Status**: {{stage5_status}}
|
**Phase Status**: {{stage5_status}}
|
||||||
|
|
||||||
@@ -152,7 +152,7 @@ Commit Message: {{git_commit_message}}
|
|||||||
| Nginx | {{nginx_status}} | {{nginx_endpoint}} |
|
| Nginx | {{nginx_status}} | {{nginx_endpoint}} |
|
||||||
| Frontend | {{frontend_status}} | {{frontend_endpoint}} |
|
| Frontend | {{frontend_status}} | {{frontend_endpoint}} |
|
||||||
| Gateway | {{gateway_status}} | {{gateway_endpoint}} |
|
| Gateway | {{gateway_status}} | {{gateway_endpoint}} |
|
||||||
| Gateway LangGraph API | {{langgraph_status}} | {{langgraph_endpoint}} |
|
| LangGraph | {{langgraph_status}} | {{langgraph_endpoint}} |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -166,7 +166,7 @@ Commit Message: {{git_commit_message}}
|
|||||||
|
|
||||||
### If the Test Fails
|
### If the Test Fails
|
||||||
1. [ ] Review references/troubleshooting.md for common solutions
|
1. [ ] Review references/troubleshooting.md for common solutions
|
||||||
2. [ ] Check local logs: `logs/{gateway,frontend,nginx}.log`
|
2. [ ] Check local logs: `logs/{langgraph,gateway,frontend,nginx}.log`
|
||||||
3. [ ] Verify configuration file format and content
|
3. [ ] Verify configuration file format and content
|
||||||
4. [ ] If needed, fully reset the environment: `make stop && make clean && make install && make dev-daemon`
|
4. [ ] If needed, fully reset the environment: `make stop && make clean && make install && make dev-daemon`
|
||||||
|
|
||||||
|
|||||||
@@ -50,11 +50,6 @@ INFOQUEST_API_KEY=your-infoquest-api-key
|
|||||||
# Set to "false" to disable Swagger UI, ReDoc, and OpenAPI schema in production
|
# Set to "false" to disable Swagger UI, ReDoc, and OpenAPI schema in production
|
||||||
# GATEWAY_ENABLE_DOCS=false
|
# GATEWAY_ENABLE_DOCS=false
|
||||||
|
|
||||||
# Shared internal Gateway auth token for multi-worker deployments.
|
|
||||||
# `make up` generates and persists this automatically; set it manually only
|
|
||||||
# when you run Gateway workers outside the bundled deploy script.
|
|
||||||
# DEER_FLOW_INTERNAL_AUTH_TOKEN=your-shared-internal-token
|
|
||||||
|
|
||||||
# ── Frontend SSR → Gateway wiring ─────────────────────────────────────────────
|
# ── Frontend SSR → Gateway wiring ─────────────────────────────────────────────
|
||||||
# The Next.js server uses these to reach the Gateway during SSR (auth checks,
|
# The Next.js server uses these to reach the Gateway during SSR (auth checks,
|
||||||
# /api/* rewrites). They default to localhost values that match `make dev` and
|
# /api/* rewrites). They default to localhost values that match `make dev` and
|
||||||
|
|||||||
@@ -1,61 +0,0 @@
|
|||||||
<!-- Reference a related issue with #123. Use Fixes / Closes / Resolves to
|
|
||||||
auto-close it on merge. Delete this line if the PR doesn't reference an issue. -->
|
|
||||||
Fixes #
|
|
||||||
|
|
||||||
## Why
|
|
||||||
|
|
||||||
<!-- Why are you opening this PR? Cover two things:
|
|
||||||
- The trigger — what made you write this? A bug you hit, a feature you need,
|
|
||||||
tech debt, or a prod issue?
|
|
||||||
- The pain being addressed — user-facing problem, or what it unblocks.
|
|
||||||
For non-trivial features, please open an issue/discussion first to align on
|
|
||||||
scope before writing code. -->
|
|
||||||
|
|
||||||
|
|
||||||
## What changed
|
|
||||||
|
|
||||||
<!-- Describe the change from a user's / caller's perspective, not as a code diff. e.g.:
|
|
||||||
- "Settings now has a 'Custom endpoint' field, off by default"
|
|
||||||
- "Backend /api/chat gains a `stream` flag, defaults to false"
|
|
||||||
- "Default model changed from X to Y — existing users notice on first run" -->
|
|
||||||
|
|
||||||
|
|
||||||
## Surface area
|
|
||||||
|
|
||||||
<!-- Check every box that applies. Reviewers use this to scope the review. -->
|
|
||||||
|
|
||||||
- [ ] **Frontend UI** — page / component / setting / interaction under `frontend/`
|
|
||||||
- [ ] **Backend API** — endpoint / SSE event / request-response shape under `backend/app`
|
|
||||||
- [ ] **Agents / LangGraph** — agent node, graph wiring, `langgraph.json`, or prompt change
|
|
||||||
- [ ] **Sandbox** — `docker/` or sandboxed execution
|
|
||||||
- [ ] **Skills** — change under `skills/`
|
|
||||||
- [ ] **Dependencies** — new/upgraded entry in `backend/pyproject.toml` or `frontend/package.json` (say what it buys us)
|
|
||||||
- [ ] **Default behavior change** — changes existing behavior without the user opting in (default model, default setting, data shape)
|
|
||||||
- [ ] **Docs / tests / CI only** — no runtime behavior change
|
|
||||||
|
|
||||||
|
|
||||||
## Screenshots / Recording
|
|
||||||
|
|
||||||
<!-- If you checked "Frontend UI", attach screenshots showing the entry point —
|
|
||||||
where users discover the change — not just the feature in isolation.
|
|
||||||
Before/after is best for behavior changes. Short GIFs welcome. -->
|
|
||||||
|
|
||||||
|
|
||||||
## Bug fix verification
|
|
||||||
|
|
||||||
<!-- Skip (delete) this section if this PR is not a bug fix.
|
|
||||||
|
|
||||||
Bugs should be encoded as a failing test that goes red before the fix.
|
|
||||||
Confirm:
|
|
||||||
- Test path that reproduces the bug:
|
|
||||||
- Did it go red on `main` and green on this branch? (yes / no)
|
|
||||||
- If a red test wasn't cheap to write, explain why and what you did instead. -->
|
|
||||||
|
|
||||||
|
|
||||||
## Validation
|
|
||||||
|
|
||||||
<!-- What you actually ran. Run at least the checks for the area you changed:
|
|
||||||
Backend: cd backend && make lint && make test
|
|
||||||
Frontend: cd frontend && pnpm format && pnpm lint && pnpm typecheck && BETTER_AUTH_SECRET=local-dev-secret pnpm build && make test
|
|
||||||
Frontend E2E (if you touched frontend/): cd frontend && make test-e2e -->
|
|
||||||
|
|
||||||
@@ -1,46 +0,0 @@
|
|||||||
name: Backend Blocking IO
|
|
||||||
|
|
||||||
on:
|
|
||||||
push:
|
|
||||||
branches: ["main"]
|
|
||||||
paths:
|
|
||||||
- "backend/**"
|
|
||||||
- ".github/workflows/backend-blocking-io-tests.yml"
|
|
||||||
pull_request:
|
|
||||||
types: [opened, synchronize, reopened, ready_for_review]
|
|
||||||
paths:
|
|
||||||
- "backend/**"
|
|
||||||
- ".github/workflows/backend-blocking-io-tests.yml"
|
|
||||||
|
|
||||||
concurrency:
|
|
||||||
group: blocking-io-${{ github.event.pull_request.number || github.ref }}
|
|
||||||
cancel-in-progress: true
|
|
||||||
|
|
||||||
permissions:
|
|
||||||
contents: read
|
|
||||||
|
|
||||||
jobs:
|
|
||||||
backend-blocking-io:
|
|
||||||
if: github.event_name != 'pull_request' || github.event.pull_request.draft == false
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
timeout-minutes: 10
|
|
||||||
|
|
||||||
steps:
|
|
||||||
- name: Checkout
|
|
||||||
uses: actions/checkout@v4
|
|
||||||
|
|
||||||
- name: Set up Python
|
|
||||||
uses: actions/setup-python@v5
|
|
||||||
with:
|
|
||||||
python-version: "3.12"
|
|
||||||
|
|
||||||
- name: Install uv
|
|
||||||
uses: astral-sh/setup-uv@v3
|
|
||||||
|
|
||||||
- name: Install backend dependencies
|
|
||||||
working-directory: backend
|
|
||||||
run: uv sync --group dev
|
|
||||||
|
|
||||||
- name: Run blocking IO regression tests
|
|
||||||
working-directory: backend
|
|
||||||
run: make test-blocking-io
|
|
||||||
@@ -1,6 +1,6 @@
|
|||||||
# DeerFlow - Unified Development Environment
|
# DeerFlow - Unified Development Environment
|
||||||
|
|
||||||
.PHONY: help config config-upgrade check install setup doctor detect-thread-boundaries detect-blocking-io dev dev-daemon start start-daemon stop up down clean docker-init docker-start docker-stop docker-logs docker-logs-frontend docker-logs-gateway
|
.PHONY: help config config-upgrade check install setup doctor dev dev-daemon start start-daemon stop up down clean docker-init docker-start docker-stop docker-logs docker-logs-frontend docker-logs-gateway
|
||||||
|
|
||||||
BASH ?= bash
|
BASH ?= bash
|
||||||
BACKEND_UV_RUN = cd backend && uv run
|
BACKEND_UV_RUN = cd backend && uv run
|
||||||
@@ -23,8 +23,6 @@ help:
|
|||||||
@echo " make config - Generate local config files (aborts if config already exists)"
|
@echo " make config - Generate local config files (aborts if config already exists)"
|
||||||
@echo " make config-upgrade - Merge new fields from config.example.yaml into config.yaml"
|
@echo " make config-upgrade - Merge new fields from config.example.yaml into config.yaml"
|
||||||
@echo " make check - Check if all required tools are installed"
|
@echo " make check - Check if all required tools are installed"
|
||||||
@echo " make detect-thread-boundaries - Inventory async/thread boundary points"
|
|
||||||
@echo " make detect-blocking-io - Inventory blocking IO that may block the backend event loop"
|
|
||||||
@echo " make install - Install all dependencies (frontend + backend + pre-commit hooks)"
|
@echo " make install - Install all dependencies (frontend + backend + pre-commit hooks)"
|
||||||
@echo " make setup-sandbox - Pre-pull sandbox container image (recommended)"
|
@echo " make setup-sandbox - Pre-pull sandbox container image (recommended)"
|
||||||
@echo " make dev - Start all services in development mode (with hot-reloading)"
|
@echo " make dev - Start all services in development mode (with hot-reloading)"
|
||||||
@@ -53,12 +51,6 @@ setup:
|
|||||||
doctor:
|
doctor:
|
||||||
@$(BACKEND_UV_RUN) python ../scripts/doctor.py
|
@$(BACKEND_UV_RUN) python ../scripts/doctor.py
|
||||||
|
|
||||||
detect-thread-boundaries:
|
|
||||||
@$(PYTHON) ./scripts/detect_thread_boundaries.py
|
|
||||||
|
|
||||||
detect-blocking-io:
|
|
||||||
@$(MAKE) -C backend detect-blocking-io
|
|
||||||
|
|
||||||
config:
|
config:
|
||||||
@$(PYTHON) ./scripts/configure.py
|
@$(PYTHON) ./scripts/configure.py
|
||||||
|
|
||||||
|
|||||||
@@ -546,15 +546,6 @@ LANGFUSE_BASE_URL=https://cloud.langfuse.com
|
|||||||
|
|
||||||
If you are using a self-hosted Langfuse instance, set `LANGFUSE_BASE_URL` to your deployment URL.
|
If you are using a self-hosted Langfuse instance, set `LANGFUSE_BASE_URL` to your deployment URL.
|
||||||
|
|
||||||
**Trace correlation fields.** Every agent run is annotated with Langfuse's reserved trace attributes so the Sessions and Users pages light up automatically:
|
|
||||||
|
|
||||||
- `session_id` = LangGraph `thread_id` — groups every trace of the same conversation
|
|
||||||
- `user_id` = effective user from `get_effective_user_id()` (falls back to `default` in no-auth mode)
|
|
||||||
- `trace_name` = assistant id (defaults to `lead-agent`)
|
|
||||||
- `tags` = `[env:<DEER_FLOW_ENV>, model:<model_name>]` (omitted when not set)
|
|
||||||
|
|
||||||
These are injected into `RunnableConfig.metadata` at the graph invocation root for both the gateway path (`runtime/runs/worker.py::run_agent`) and the embedded path (`client.py::DeerFlowClient.stream`), so any LangChain-compatible callback can read them. Set `DEER_FLOW_ENV` (or `ENVIRONMENT`) to tag traces by deployment environment.
|
|
||||||
|
|
||||||
#### Using Both Providers
|
#### Using Both Providers
|
||||||
|
|
||||||
If both LangSmith and Langfuse are enabled, DeerFlow attaches both tracing callbacks and reports the same model activity to both systems.
|
If both LangSmith and Langfuse are enabled, DeerFlow attaches both tracing callbacks and reports the same model activity to both systems.
|
||||||
@@ -740,12 +731,6 @@ DeerFlow has key high-privilege capabilities including **system command executio
|
|||||||
We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, workflow, and guidelines.
|
We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, workflow, and guidelines.
|
||||||
|
|
||||||
Regression coverage includes Docker sandbox mode detection and provisioner kubeconfig-path handling tests in `backend/tests/`.
|
Regression coverage includes Docker sandbox mode detection and provisioner kubeconfig-path handling tests in `backend/tests/`.
|
||||||
Backend blocking-IO diagnostics are available from the repository root with
|
|
||||||
`make detect-blocking-io`: it statically scans backend business code for
|
|
||||||
blocking IO that may run on the backend event loop, prints a concise summary,
|
|
||||||
and writes complete JSON findings to `.deer-flow/blocking-io-findings.json`.
|
|
||||||
The JSON includes compact review records with `priority`, `location`,
|
|
||||||
`blocking_call`, `event_loop_exposure`, `reason`, and `code`.
|
|
||||||
Gateway artifact serving now forces active web content types (`text/html`, `application/xhtml+xml`, `image/svg+xml`) to download as attachments instead of inline rendering, reducing XSS risk for generated artifacts.
|
Gateway artifact serving now forces active web content types (`text/html`, `application/xhtml+xml`, `image/svg+xml`) to download as attachments instead of inline rendering, reducing XSS risk for generated artifacts.
|
||||||
|
|
||||||
## License
|
## License
|
||||||
|
|||||||
+12
-88
@@ -88,57 +88,18 @@ make stop # Stop all services
|
|||||||
|
|
||||||
**Backend directory** (for backend development only):
|
**Backend directory** (for backend development only):
|
||||||
```bash
|
```bash
|
||||||
make install # Install backend dependencies
|
make install # Install backend dependencies
|
||||||
make dev # Run Gateway API with reload (port 8001)
|
make dev # Run Gateway API with reload (port 8001)
|
||||||
make gateway # Run Gateway API only (port 8001)
|
make gateway # Run Gateway API only (port 8001)
|
||||||
make test # Run all backend tests
|
make test # Run all backend tests
|
||||||
make test-blocking-io # Run strict Blockbuster runtime gate on tests/blocking_io/
|
make lint # Lint with ruff
|
||||||
make lint # Lint with ruff
|
make format # Format code with ruff
|
||||||
make format # Format code with ruff
|
|
||||||
```
|
```
|
||||||
|
|
||||||
The `detect-blocking-io` target parses `app/`, `packages/harness/deerflow/`,
|
|
||||||
and `scripts/` with AST. By default it reports only blocking IO candidates that
|
|
||||||
are inside async code, reachable from async code in the same file, or reachable
|
|
||||||
from sync-only `AgentMiddleware` before/after hooks that LangGraph can execute
|
|
||||||
on the async graph path. It prints a concise summary and writes complete JSON
|
|
||||||
findings to `.deer-flow/blocking-io-findings.json` at the repository root
|
|
||||||
(both `make detect-blocking-io` from the repo root and `cd backend && make
|
|
||||||
detect-blocking-io` resolve to the same repo-root path). JSON findings include
|
|
||||||
`priority`, `location`, `blocking_call`, `event_loop_exposure`, `reason`, and
|
|
||||||
`code` for model-assisted or manual review. `priority` is a deterministic
|
|
||||||
review ordering from operation type, not proof of a bug. Bare-name same-file
|
|
||||||
calls are resolved by function name, so duplicate helper names in one file can
|
|
||||||
conservatively over-report async reachability. It is intentionally
|
|
||||||
informational and is not run from CI in this round.
|
|
||||||
|
|
||||||
Regression tests related to Docker/provisioner behavior:
|
Regression tests related to Docker/provisioner behavior:
|
||||||
- `tests/test_docker_sandbox_mode_detection.py` (mode detection from `config.yaml`)
|
- `tests/test_docker_sandbox_mode_detection.py` (mode detection from `config.yaml`)
|
||||||
- `tests/test_provisioner_kubeconfig.py` (kubeconfig file/directory handling)
|
- `tests/test_provisioner_kubeconfig.py` (kubeconfig file/directory handling)
|
||||||
|
|
||||||
Blocking-IO runtime gate (`tests/blocking_io/`):
|
|
||||||
- Wraps every item under `tests/blocking_io/` with a strict Blockbuster
|
|
||||||
context scoped to `app.*` and `deerflow.*` (see
|
|
||||||
`tests/support/detectors/blocking_io_runtime.py`). Any sync blocking IO
|
|
||||||
call whose stack passes through DeerFlow business code while running on
|
|
||||||
the asyncio event loop raises `BlockingError` and fails the test.
|
|
||||||
- Regression anchors live there: `test_skills_load.py` (locks the
|
|
||||||
`asyncio.to_thread` offload around `LocalSkillStorage.load_skills`, fix
|
|
||||||
for #1917); `test_sqlite_lifespan.py` (locks the offload around
|
|
||||||
SQLite path resolution plus `ensure_sqlite_parent_dir`, fix for #1912);
|
|
||||||
`test_jsonl_run_event_store.py` (locks `JsonlRunEventStore`'s async
|
|
||||||
API offloading its file IO via `asyncio.to_thread`, fix #3084); and
|
|
||||||
`test_uploads_middleware.py` (locks `UploadsMiddleware.abefore_agent`
|
|
||||||
offloading the uploads-directory scan off the event loop).
|
|
||||||
- `test_gate_smoke.py` is a meta-test asserting the gate actually catches
|
|
||||||
unoffloaded blocking IO and that the `@pytest.mark.allow_blocking_io`
|
|
||||||
opt-out works.
|
|
||||||
- Coverage boundary: the gate only sees code that test execution actually
|
|
||||||
touches. Static AST coverage is a separate concern (out of scope for
|
|
||||||
this PR).
|
|
||||||
- CI: runs on every PR via `.github/workflows/backend-blocking-io-tests.yml`,
|
|
||||||
hard-fail.
|
|
||||||
|
|
||||||
Boundary check (harness → app import firewall):
|
Boundary check (harness → app import firewall):
|
||||||
- `tests/test_harness_boundary.py` — ensures `packages/harness/deerflow/` never imports from `app.*`
|
- `tests/test_harness_boundary.py` — ensures `packages/harness/deerflow/` never imports from `app.*`
|
||||||
|
|
||||||
@@ -223,18 +184,6 @@ Setup: Copy `config.example.yaml` to `config.yaml` in the **project root** direc
|
|||||||
|
|
||||||
**Config Caching**: `get_app_config()` caches the parsed config, but automatically reloads it when the resolved config path changes or the file's mtime increases. This keeps Gateway and LangGraph reads aligned with `config.yaml` edits without requiring a manual process restart.
|
**Config Caching**: `get_app_config()` caches the parsed config, but automatically reloads it when the resolved config path changes or the file's mtime increases. This keeps Gateway and LangGraph reads aligned with `config.yaml` edits without requiring a manual process restart.
|
||||||
|
|
||||||
**Config Hot-Reload Boundary**: Gateway dependencies route through `get_app_config()` on every request, so per-run fields like `models[*].max_tokens`, `summarization.*`, `title.*`, `memory.*`, `subagents.*`, `tools[*]`, and the agent system prompt pick up `config.yaml` edits on the next message. `AppConfig` is intentionally **not** cached on `app.state` — `lifespan()` keeps a local `startup_config` variable for one-shot bootstrap work (logging level, channels, `langgraph_runtime` engines) and passes it explicitly to `langgraph_runtime(app, startup_config)`. Infrastructure fields are **restart-required**:
|
|
||||||
|
|
||||||
| Field | Why a restart is required |
|
|
||||||
|---|---|
|
|
||||||
| `database.*` | `init_engine_from_config()` runs once during `langgraph_runtime()` startup; the SQLAlchemy engine holds the connection pool. |
|
|
||||||
| `checkpointer.*` (including SQLite WAL/journal settings) | `make_checkpointer()` binds the persistent checkpointer once at startup. |
|
|
||||||
| `run_events.*` | `make_run_event_store()` selects memory- vs. SQL-backed implementation at startup. |
|
|
||||||
| `stream_bridge.*` | `make_stream_bridge()` constructs the bridge object once. |
|
|
||||||
| `sandbox.use` | `get_sandbox_provider()` caches the provider singleton (`_default_sandbox_provider`); a new class path takes effect only on next process start. |
|
|
||||||
| `log_level` | `apply_logging_level()` is called only in `app.py` startup; it mutates the root logger's level, and `get_app_config()` returning a fresh `AppConfig` does not retrigger it. |
|
|
||||||
| `channels.*` IM platform credentials | `start_channel_service()` is invoked once during startup; live channels are not rebuilt on config change. |
|
|
||||||
|
|
||||||
Configuration priority:
|
Configuration priority:
|
||||||
1. Explicit `config_path` argument
|
1. Explicit `config_path` argument
|
||||||
2. `DEER_FLOW_CONFIG_PATH` environment variable
|
2. `DEER_FLOW_CONFIG_PATH` environment variable
|
||||||
@@ -276,28 +225,21 @@ CORS is same-origin by default when requests enter through nginx on port 2026. S
|
|||||||
| **Feedback** (`/api/threads/{id}/runs/{rid}/feedback`) | `PUT /` - upsert feedback; `DELETE /` - delete user feedback; `POST /` - create feedback; `GET /` - list feedback; `GET /stats` - aggregate stats; `DELETE /{fid}` - delete specific |
|
| **Feedback** (`/api/threads/{id}/runs/{rid}/feedback`) | `PUT /` - upsert feedback; `DELETE /` - delete user feedback; `POST /` - create feedback; `GET /` - list feedback; `GET /stats` - aggregate stats; `DELETE /{fid}` - delete specific |
|
||||||
| **Runs** (`/api/runs`) | `POST /stream` - stateless run + SSE; `POST /wait` - stateless run + block; `GET /{rid}/messages` - paginated messages by run_id `{data, has_more}` (cursor: `after_seq`/`before_seq`); `GET /{rid}/feedback` - list feedback by run_id |
|
| **Runs** (`/api/runs`) | `POST /stream` - stateless run + SSE; `POST /wait` - stateless run + block; `GET /{rid}/messages` - paginated messages by run_id `{data, has_more}` (cursor: `after_seq`/`before_seq`); `GET /{rid}/feedback` - list feedback by run_id |
|
||||||
|
|
||||||
**RunManager / RunStore contract**:
|
|
||||||
- `RunManager.get()` is async; direct callers must `await` it.
|
|
||||||
- When a persistent `RunStore` is configured, `get()` and `list_by_thread()` hydrate historical runs from the store. In-memory records win for the same `run_id` so task, abort, and stream-control state stays attached to active local runs.
|
|
||||||
- `cancel()` and `create_or_reject(..., multitask_strategy="interrupt"|"rollback")` persist interrupted status through `RunStore.update_status()`, matching normal `set_status()` transitions.
|
|
||||||
- Store-only hydrated runs are readable history. If the current worker has no in-memory task/control state for that run, cancellation APIs can return 409 because this worker cannot stop the task.
|
|
||||||
- `POST /wait` (both thread-scoped and `/api/runs/wait`) drains the stream bridge via `wait_for_run_completion()` instead of bare `await record.task`, so it honours the run's `on_disconnect` setting and cancels the background run on real client disconnect rather than returning a stale checkpoint (issue #3265).
|
|
||||||
|
|
||||||
Proxied through nginx: `/api/langgraph/*` → Gateway LangGraph-compatible runtime, all other `/api/*` → Gateway REST APIs.
|
Proxied through nginx: `/api/langgraph/*` → Gateway LangGraph-compatible runtime, all other `/api/*` → Gateway REST APIs.
|
||||||
|
|
||||||
### Sandbox System (`packages/harness/deerflow/sandbox/`)
|
### Sandbox System (`packages/harness/deerflow/sandbox/`)
|
||||||
|
|
||||||
**Interface**: Abstract `Sandbox` with `execute_command`, `read_file`, `write_file`, `list_dir`
|
**Interface**: Abstract `Sandbox` with `execute_command`, `read_file`, `write_file`, `list_dir`
|
||||||
**Provider Pattern**: `SandboxProvider` with `acquire`, `acquire_async`, `get`, `release` lifecycle. Async agent/tool paths call async sandbox lifecycle hooks so Docker sandbox creation, discovery, cross-process locking, readiness polling, and release stay off the event loop.
|
**Provider Pattern**: `SandboxProvider` with `acquire`, `get`, `release` lifecycle
|
||||||
**Implementations**:
|
**Implementations**:
|
||||||
- `LocalSandboxProvider` - Local filesystem execution. `acquire(thread_id)` returns a per-thread `LocalSandbox` (id `local:{thread_id}`) whose `path_mappings` resolve `/mnt/user-data/{workspace,uploads,outputs}` and `/mnt/acp-workspace` to that thread's host directories, so the public `Sandbox` API honours the `/mnt/user-data` contract uniformly with AIO. `acquire()` / `acquire(None)` keeps the legacy generic singleton (id `local`) for callers without a thread context. Per-thread sandboxes are held in an LRU cache (default 256 entries) guarded by a `threading.Lock`.
|
- `LocalSandboxProvider` - Singleton local filesystem execution with path mappings
|
||||||
- `AioSandboxProvider` (`packages/harness/deerflow/community/`) - Docker-based isolation
|
- `AioSandboxProvider` (`packages/harness/deerflow/community/`) - Docker-based isolation
|
||||||
|
|
||||||
**Virtual Path System**:
|
**Virtual Path System**:
|
||||||
- Agent sees: `/mnt/user-data/{workspace,uploads,outputs}`, `/mnt/skills`
|
- Agent sees: `/mnt/user-data/{workspace,uploads,outputs}`, `/mnt/skills`
|
||||||
- Physical: `backend/.deer-flow/users/{user_id}/threads/{thread_id}/user-data/...`, `deer-flow/skills/`
|
- Physical: `backend/.deer-flow/users/{user_id}/threads/{thread_id}/user-data/...`, `deer-flow/skills/`
|
||||||
- Translation: `LocalSandboxProvider` builds per-thread `PathMapping`s for the user-data prefixes at acquire time; `tools.py` keeps `replace_virtual_path()` / `replace_virtual_paths_in_command()` as a defense-in-depth layer (and for path validation). AIO has the directories volume-mounted at the same virtual paths inside its container, so both implementations accept `/mnt/user-data/...` natively.
|
- Translation: `replace_virtual_path()` / `replace_virtual_paths_in_command()`
|
||||||
- Detection: `is_local_sandbox()` accepts both `sandbox_id == "local"` (legacy / no-thread) and `sandbox_id.startswith("local:")` (per-thread)
|
- Detection: `is_local_sandbox()` checks `sandbox_id == "local"`
|
||||||
|
|
||||||
**Sandbox Tools** (in `packages/harness/deerflow/sandbox/tools.py`):
|
**Sandbox Tools** (in `packages/harness/deerflow/sandbox/tools.py`):
|
||||||
- `bash` - Execute commands with path translation and error handling
|
- `bash` - Execute commands with path translation and error handling
|
||||||
@@ -347,7 +289,7 @@ Proxied through nginx: `/api/langgraph/*` → Gateway LangGraph-compatible runti
|
|||||||
- **Cache invalidation**: Detects config file changes via mtime comparison
|
- **Cache invalidation**: Detects config file changes via mtime comparison
|
||||||
- **Transports**: stdio (command-based), SSE, HTTP
|
- **Transports**: stdio (command-based), SSE, HTTP
|
||||||
- **OAuth (HTTP/SSE)**: Supports token endpoint flows (`client_credentials`, `refresh_token`) with automatic token refresh + Authorization header injection
|
- **OAuth (HTTP/SSE)**: Supports token endpoint flows (`client_credentials`, `refresh_token`) with automatic token refresh + Authorization header injection
|
||||||
- **Runtime updates**: Gateway API saves to extensions_config.json; the Gateway-embedded runtime detects changes via mtime
|
- **Runtime updates**: Gateway API saves to extensions_config.json; LangGraph detects via mtime
|
||||||
|
|
||||||
### Skills System (`packages/harness/deerflow/skills/`)
|
### Skills System (`packages/harness/deerflow/skills/`)
|
||||||
|
|
||||||
@@ -374,7 +316,7 @@ Proxied through nginx: `/api/langgraph/*` → Gateway LangGraph-compatible runti
|
|||||||
|
|
||||||
### IM Channels System (`app/channels/`)
|
### IM Channels System (`app/channels/`)
|
||||||
|
|
||||||
Bridges external messaging platforms (Feishu, Slack, Telegram, DingTalk) to the DeerFlow agent via Gateway's LangGraph-compatible API.
|
Bridges external messaging platforms (Feishu, Slack, Telegram, DingTalk) to the DeerFlow agent via the LangGraph Server.
|
||||||
|
|
||||||
|
|
||||||
**Architecture**: Channels communicate with Gateway through the `langgraph-sdk` HTTP client (same as the frontend), ensuring threads are created and managed server-side. The internal SDK client injects process-local internal auth plus a matching CSRF cookie/header pair so Gateway accepts state-changing thread/run requests from channel workers without relying on browser session cookies.
|
**Architecture**: Channels communicate with Gateway through the `langgraph-sdk` HTTP client (same as the frontend), ensuring threads are created and managed server-side. The internal SDK client injects process-local internal auth plus a matching CSRF cookie/header pair so Gateway accepts state-changing thread/run requests from channel workers without relying on browser session cookies.
|
||||||
@@ -449,24 +391,6 @@ Focused regression coverage for the updater lives in `backend/tests/test_memory_
|
|||||||
- `resolve_variable(path)` - Import module and return variable (e.g., `module.path:variable_name`)
|
- `resolve_variable(path)` - Import module and return variable (e.g., `module.path:variable_name`)
|
||||||
- `resolve_class(path, base_class)` - Import and validate class against base class
|
- `resolve_class(path, base_class)` - Import and validate class against base class
|
||||||
|
|
||||||
### Tracing System (`packages/harness/deerflow/tracing/`)
|
|
||||||
|
|
||||||
LangSmith and Langfuse are both supported. The wiring lives in two layers:
|
|
||||||
|
|
||||||
- `factory.py::build_tracing_callbacks()` — returns the LangChain `CallbackHandler` list for the providers currently enabled via env vars (`LANGSMITH_TRACING`, `LANGFUSE_TRACING`, etc.). The handlers are attached at the **graph invocation root** for in-graph runs (`make_lead_agent` and `DeerFlowClient.stream` both append them to `config["callbacks"]` before invoking the graph) so a single run produces one trace with all node / LLM / tool calls as child spans. Standalone callers — anything that invokes a model outside such a graph (e.g. `MemoryUpdater`) — keep `create_chat_model`'s default `attach_tracing=True`, which falls back to model-level callback attachment.
|
|
||||||
- `metadata.py::build_langfuse_trace_metadata()` — builds the Langfuse-reserved trace attributes for `RunnableConfig.metadata`. The Langfuse v4 `langchain.CallbackHandler` lifts these onto the root trace (see its `_parse_langfuse_trace_attributes`), but only when it sees `on_chain_start(parent_run_id=None)` — which is why the callbacks have to live at the graph root, not the model.
|
|
||||||
|
|
||||||
**Trace-attribute injection points**: both `runtime/runs/worker.py::run_agent` (gateway path) and `client.py::DeerFlowClient.stream` (embedded path) merge the metadata into `config["metadata"]` right before constructing the graph. Caller-supplied keys win via `setdefault`, so an external `session_id` override is preserved. Field mapping:
|
|
||||||
|
|
||||||
| Langfuse field | Source |
|
|
||||||
|-----------------------|----------------------------------------------|
|
|
||||||
| `langfuse_session_id` | LangGraph `thread_id` |
|
|
||||||
| `langfuse_user_id` | `get_effective_user_id()` (`default` in no-auth) |
|
|
||||||
| `langfuse_trace_name` | `RunRecord.assistant_id` / client `agent_name` (defaults to `lead-agent`) |
|
|
||||||
| `langfuse_tags` | `env:<DEER_FLOW_ENV>` + `model:<model_name>` |
|
|
||||||
|
|
||||||
Returns `{}` when Langfuse is not in the enabled providers — LangSmith-only deployments are unaffected. Set `DEER_FLOW_ENV` (or `ENVIRONMENT`) to tag traces by deployment environment. Tests live in `tests/test_tracing_factory.py`, `tests/test_tracing_metadata.py`, `tests/test_worker_langfuse_metadata.py`, and `tests/test_client_langfuse_metadata.py`.
|
|
||||||
|
|
||||||
### Config Schema
|
### Config Schema
|
||||||
|
|
||||||
**`config.yaml`** key sections:
|
**`config.yaml`** key sections:
|
||||||
|
|||||||
+3
-9
@@ -2,16 +2,13 @@ install:
|
|||||||
uv sync
|
uv sync
|
||||||
|
|
||||||
dev:
|
dev:
|
||||||
PYTHONPATH=. PYTHONIOENCODING=utf-8 PYTHONUTF8=1 uv run uvicorn app.gateway.app:app --host 0.0.0.0 --port 8001 --reload
|
PYTHONPATH=. uv run uvicorn app.gateway.app:app --host 0.0.0.0 --port 8001 --reload
|
||||||
|
|
||||||
gateway:
|
gateway:
|
||||||
PYTHONPATH=. PYTHONIOENCODING=utf-8 PYTHONUTF8=1 uv run uvicorn app.gateway.app:app --host 0.0.0.0 --port 8001
|
PYTHONPATH=. uv run uvicorn app.gateway.app:app --host 0.0.0.0 --port 8001
|
||||||
|
|
||||||
test:
|
test:
|
||||||
PYTHONPATH=. PYTHONIOENCODING=utf-8 PYTHONUTF8=1 uv run pytest tests/ -v
|
PYTHONPATH=. uv run pytest tests/ -v
|
||||||
|
|
||||||
test-blocking-io:
|
|
||||||
PYTHONPATH=. PYTHONIOENCODING=utf-8 PYTHONUTF8=1 uv run pytest tests/blocking_io -q --tb=short
|
|
||||||
|
|
||||||
lint:
|
lint:
|
||||||
uvx ruff check .
|
uvx ruff check .
|
||||||
@@ -19,6 +16,3 @@ lint:
|
|||||||
|
|
||||||
format:
|
format:
|
||||||
uvx ruff check . --fix && uvx ruff format .
|
uvx ruff check . --fix && uvx ruff format .
|
||||||
|
|
||||||
detect-blocking-io:
|
|
||||||
@PYTHONPATH=. PYTHONIOENCODING=utf-8 PYTHONUTF8=1 uv run python ../scripts/detect_blocking_io_static.py --output ../.deer-flow/blocking-io-findings.json
|
|
||||||
|
|||||||
+1
-14
@@ -69,7 +69,7 @@ Middlewares execute in strict order, each handling a specific concern:
|
|||||||
Per-thread isolated execution with virtual path translation:
|
Per-thread isolated execution with virtual path translation:
|
||||||
|
|
||||||
- **Abstract interface**: `execute_command`, `read_file`, `write_file`, `list_dir`
|
- **Abstract interface**: `execute_command`, `read_file`, `write_file`, `list_dir`
|
||||||
- **Providers**: `LocalSandboxProvider` (filesystem) and `AioSandboxProvider` (Docker, in community/). Async runtime paths use async sandbox lifecycle hooks so startup, readiness polling, and release do not block the event loop.
|
- **Providers**: `LocalSandboxProvider` (filesystem) and `AioSandboxProvider` (Docker, in community/)
|
||||||
- **Virtual paths**: `/mnt/user-data/{workspace,uploads,outputs}` → thread-specific physical directories
|
- **Virtual paths**: `/mnt/user-data/{workspace,uploads,outputs}` → thread-specific physical directories
|
||||||
- **Skills path**: `/mnt/skills` → `deer-flow/skills/` directory
|
- **Skills path**: `/mnt/skills` → `deer-flow/skills/` directory
|
||||||
- **Skills loading**: Recursively discovers nested `SKILL.md` files under `skills/{public,custom}` and preserves nested container paths
|
- **Skills loading**: Recursively discovers nested `SKILL.md` files under `skills/{public,custom}` and preserves nested container paths
|
||||||
@@ -362,7 +362,6 @@ make dev # Run Gateway API + embedded agent runtime (port 8001)
|
|||||||
make gateway # Run Gateway API without reload (port 8001)
|
make gateway # Run Gateway API without reload (port 8001)
|
||||||
make lint # Run linter (ruff)
|
make lint # Run linter (ruff)
|
||||||
make format # Format code (ruff)
|
make format # Format code (ruff)
|
||||||
make detect-blocking-io # Inventory blocking IO that may block the backend event loop
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Code Style
|
### Code Style
|
||||||
@@ -379,18 +378,6 @@ make detect-blocking-io # Inventory blocking IO that may block the backend even
|
|||||||
uv run pytest
|
uv run pytest
|
||||||
```
|
```
|
||||||
|
|
||||||
`make detect-blocking-io` statically scans backend business code for blocking
|
|
||||||
IO that may run on the backend event loop and is not test-coverage-bound. It
|
|
||||||
prints a concise summary for human review and writes complete JSON findings to
|
|
||||||
`.deer-flow/blocking-io-findings.json` at the repository root (regardless of
|
|
||||||
whether the target is invoked from the repo root or from `backend/`). JSON
|
|
||||||
findings include both broad IO category and review-oriented fields such as
|
|
||||||
`priority`, `location`, `blocking_call`, `event_loop_exposure`, `reason`, and
|
|
||||||
`code`. `priority` is a deterministic review ordering from the operation type,
|
|
||||||
not proof of a bug. Bare-name same-file calls are resolved by function name,
|
|
||||||
so duplicate helper names in one file can conservatively over-report async
|
|
||||||
reachability.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Technology Stack
|
## Technology Stack
|
||||||
|
|||||||
+11
-291
@@ -3,10 +3,8 @@
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import asyncio
|
import asyncio
|
||||||
import json
|
|
||||||
import logging
|
import logging
|
||||||
import threading
|
import threading
|
||||||
from pathlib import Path
|
|
||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
from app.channels.base import Channel
|
from app.channels.base import Channel
|
||||||
@@ -23,12 +21,6 @@ class DiscordChannel(Channel):
|
|||||||
Configuration keys (in ``config.yaml`` under ``channels.discord``):
|
Configuration keys (in ``config.yaml`` under ``channels.discord``):
|
||||||
- ``bot_token``: Discord Bot token.
|
- ``bot_token``: Discord Bot token.
|
||||||
- ``allowed_guilds``: (optional) List of allowed Discord guild IDs. Empty = allow all.
|
- ``allowed_guilds``: (optional) List of allowed Discord guild IDs. Empty = allow all.
|
||||||
- ``mention_only``: (optional) If true, only respond when the bot is mentioned.
|
|
||||||
- ``allowed_channels``: (optional) List of channel IDs where messages are always accepted
|
|
||||||
(even when mention_only is true). Use for channels where you want the bot to respond
|
|
||||||
without mentions. Empty = mention_only applies everywhere.
|
|
||||||
- ``thread_mode``: (optional) If true, group a channel conversation into a thread.
|
|
||||||
Default: same as ``mention_only``.
|
|
||||||
"""
|
"""
|
||||||
|
|
||||||
def __init__(self, bus: MessageBus, config: dict[str, Any]) -> None:
|
def __init__(self, bus: MessageBus, config: dict[str, Any]) -> None:
|
||||||
@@ -40,29 +32,6 @@ class DiscordChannel(Channel):
|
|||||||
self._allowed_guilds.add(int(guild_id))
|
self._allowed_guilds.add(int(guild_id))
|
||||||
except (TypeError, ValueError):
|
except (TypeError, ValueError):
|
||||||
continue
|
continue
|
||||||
self._mention_only: bool = bool(config.get("mention_only", False))
|
|
||||||
self._thread_mode: bool = config.get("thread_mode", self._mention_only)
|
|
||||||
self._allowed_channels: set[str] = set()
|
|
||||||
for channel_id in config.get("allowed_channels", []):
|
|
||||||
self._allowed_channels.add(str(channel_id))
|
|
||||||
|
|
||||||
# Session tracking: channel_id -> Discord thread_id (in-memory, persisted to JSON).
|
|
||||||
# Uses a dedicated JSON file separate from ChannelStore, which maps IM
|
|
||||||
# conversations to DeerFlow thread IDs — a different concern.
|
|
||||||
self._active_threads: dict[str, str] = {}
|
|
||||||
# Reverse-lookup set for O(1) thread ID checks (avoids O(n) scan of _active_threads.values()).
|
|
||||||
self._active_thread_ids: set[str] = set()
|
|
||||||
# Lock protecting _active_threads and the JSON file from concurrent access.
|
|
||||||
# _run_client (Discord loop thread) and the main thread both read/write.
|
|
||||||
self._thread_store_lock = threading.Lock()
|
|
||||||
store = config.get("channel_store")
|
|
||||||
if store is not None:
|
|
||||||
self._thread_store_path = store._path.parent / "discord_threads.json"
|
|
||||||
else:
|
|
||||||
self._thread_store_path = Path.home() / ".deer-flow" / "channels" / "discord_threads.json"
|
|
||||||
|
|
||||||
# Typing indicator management
|
|
||||||
self._typing_tasks: dict[str, asyncio.Task] = {}
|
|
||||||
|
|
||||||
self._client = None
|
self._client = None
|
||||||
self._thread: threading.Thread | None = None
|
self._thread: threading.Thread | None = None
|
||||||
@@ -106,56 +75,12 @@ class DiscordChannel(Channel):
|
|||||||
|
|
||||||
self._thread = threading.Thread(target=self._run_client, daemon=True)
|
self._thread = threading.Thread(target=self._run_client, daemon=True)
|
||||||
self._thread.start()
|
self._thread.start()
|
||||||
self._load_active_threads()
|
|
||||||
logger.info("Discord channel started")
|
logger.info("Discord channel started")
|
||||||
|
|
||||||
def _load_active_threads(self) -> None:
|
|
||||||
"""Restore Discord thread mappings from the dedicated JSON file on startup."""
|
|
||||||
with self._thread_store_lock:
|
|
||||||
try:
|
|
||||||
if not self._thread_store_path.exists():
|
|
||||||
logger.debug("[Discord] no thread mappings file at %s", self._thread_store_path)
|
|
||||||
return
|
|
||||||
data = json.loads(self._thread_store_path.read_text())
|
|
||||||
self._active_threads.clear()
|
|
||||||
self._active_thread_ids.clear()
|
|
||||||
for channel_id, thread_id in data.items():
|
|
||||||
self._active_threads[channel_id] = thread_id
|
|
||||||
self._active_thread_ids.add(thread_id)
|
|
||||||
if self._active_threads:
|
|
||||||
logger.info("[Discord] restored %d thread mappings from %s", len(self._active_threads), self._thread_store_path)
|
|
||||||
except Exception:
|
|
||||||
logger.exception("[Discord] failed to load thread mappings")
|
|
||||||
|
|
||||||
def _save_thread(self, channel_id: str, thread_id: str) -> None:
|
|
||||||
"""Persist a Discord thread mapping to the dedicated JSON file."""
|
|
||||||
with self._thread_store_lock:
|
|
||||||
try:
|
|
||||||
data: dict[str, str] = {}
|
|
||||||
if self._thread_store_path.exists():
|
|
||||||
data = json.loads(self._thread_store_path.read_text())
|
|
||||||
old_id = data.get(channel_id)
|
|
||||||
data[channel_id] = thread_id
|
|
||||||
# Update reverse-lookup set
|
|
||||||
if old_id:
|
|
||||||
self._active_thread_ids.discard(old_id)
|
|
||||||
self._active_thread_ids.add(thread_id)
|
|
||||||
self._thread_store_path.parent.mkdir(parents=True, exist_ok=True)
|
|
||||||
self._thread_store_path.write_text(json.dumps(data, indent=2))
|
|
||||||
except Exception:
|
|
||||||
logger.exception("[Discord] failed to save thread mapping for channel %s", channel_id)
|
|
||||||
|
|
||||||
async def stop(self) -> None:
|
async def stop(self) -> None:
|
||||||
self._running = False
|
self._running = False
|
||||||
self.bus.unsubscribe_outbound(self._on_outbound)
|
self.bus.unsubscribe_outbound(self._on_outbound)
|
||||||
|
|
||||||
# Cancel all active typing indicator tasks
|
|
||||||
for target_id, task in list(self._typing_tasks.items()):
|
|
||||||
if not task.done():
|
|
||||||
task.cancel()
|
|
||||||
logger.debug("[Discord] cancelled typing task for target %s", target_id)
|
|
||||||
self._typing_tasks.clear()
|
|
||||||
|
|
||||||
if self._client and self._discord_loop and self._discord_loop.is_running():
|
if self._client and self._discord_loop and self._discord_loop.is_running():
|
||||||
close_future = asyncio.run_coroutine_threadsafe(self._client.close(), self._discord_loop)
|
close_future = asyncio.run_coroutine_threadsafe(self._client.close(), self._discord_loop)
|
||||||
try:
|
try:
|
||||||
@@ -175,10 +100,6 @@ class DiscordChannel(Channel):
|
|||||||
logger.info("Discord channel stopped")
|
logger.info("Discord channel stopped")
|
||||||
|
|
||||||
async def send(self, msg: OutboundMessage) -> None:
|
async def send(self, msg: OutboundMessage) -> None:
|
||||||
# Stop typing indicator once we're sending the response
|
|
||||||
stop_future = asyncio.run_coroutine_threadsafe(self._stop_typing(msg.chat_id, msg.thread_ts), self._discord_loop)
|
|
||||||
await asyncio.wrap_future(stop_future)
|
|
||||||
|
|
||||||
target = await self._resolve_target(msg)
|
target = await self._resolve_target(msg)
|
||||||
if target is None:
|
if target is None:
|
||||||
logger.error("[Discord] target not found for chat_id=%s thread_ts=%s", msg.chat_id, msg.thread_ts)
|
logger.error("[Discord] target not found for chat_id=%s thread_ts=%s", msg.chat_id, msg.thread_ts)
|
||||||
@@ -190,9 +111,6 @@ class DiscordChannel(Channel):
|
|||||||
await asyncio.wrap_future(send_future)
|
await asyncio.wrap_future(send_future)
|
||||||
|
|
||||||
async def send_file(self, msg: OutboundMessage, attachment: ResolvedAttachment) -> bool:
|
async def send_file(self, msg: OutboundMessage, attachment: ResolvedAttachment) -> bool:
|
||||||
stop_future = asyncio.run_coroutine_threadsafe(self._stop_typing(msg.chat_id, msg.thread_ts), self._discord_loop)
|
|
||||||
await asyncio.wrap_future(stop_future)
|
|
||||||
|
|
||||||
target = await self._resolve_target(msg)
|
target = await self._resolve_target(msg)
|
||||||
if target is None:
|
if target is None:
|
||||||
logger.error("[Discord] target not found for file upload chat_id=%s thread_ts=%s", msg.chat_id, msg.thread_ts)
|
logger.error("[Discord] target not found for file upload chat_id=%s thread_ts=%s", msg.chat_id, msg.thread_ts)
|
||||||
@@ -212,41 +130,6 @@ class DiscordChannel(Channel):
|
|||||||
logger.exception("[Discord] failed to upload file: %s", attachment.filename)
|
logger.exception("[Discord] failed to upload file: %s", attachment.filename)
|
||||||
return False
|
return False
|
||||||
|
|
||||||
async def _start_typing(self, channel, chat_id: str, thread_ts: str | None = None) -> None:
|
|
||||||
"""Starts a loop to send periodic typing indicators."""
|
|
||||||
target_id = thread_ts or chat_id
|
|
||||||
if target_id in self._typing_tasks:
|
|
||||||
return # Already typing for this target
|
|
||||||
|
|
||||||
async def _typing_loop():
|
|
||||||
try:
|
|
||||||
while True:
|
|
||||||
try:
|
|
||||||
await channel.trigger_typing()
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
await asyncio.sleep(10)
|
|
||||||
except asyncio.CancelledError:
|
|
||||||
pass
|
|
||||||
|
|
||||||
task = asyncio.create_task(_typing_loop())
|
|
||||||
self._typing_tasks[target_id] = task
|
|
||||||
|
|
||||||
async def _stop_typing(self, chat_id: str, thread_ts: str | None = None) -> None:
|
|
||||||
"""Stops the typing loop for a specific target."""
|
|
||||||
target_id = thread_ts or chat_id
|
|
||||||
task = self._typing_tasks.pop(target_id, None)
|
|
||||||
if task and not task.done():
|
|
||||||
task.cancel()
|
|
||||||
logger.debug("[Discord] stopped typing indicator for target %s", target_id)
|
|
||||||
|
|
||||||
async def _add_reaction(self, message) -> None:
|
|
||||||
"""Add a checkmark reaction to acknowledge the message was received."""
|
|
||||||
try:
|
|
||||||
await message.add_reaction("✅")
|
|
||||||
except Exception:
|
|
||||||
logger.debug("[Discord] failed to add reaction to message %s", message.id, exc_info=True)
|
|
||||||
|
|
||||||
async def _on_message(self, message) -> None:
|
async def _on_message(self, message) -> None:
|
||||||
if not self._running or not self._client:
|
if not self._running or not self._client:
|
||||||
return
|
return
|
||||||
@@ -269,143 +152,15 @@ class DiscordChannel(Channel):
|
|||||||
if self._discord_module is None:
|
if self._discord_module is None:
|
||||||
return
|
return
|
||||||
|
|
||||||
# Determine whether the bot is mentioned in this message
|
|
||||||
user = self._client.user if self._client else None
|
|
||||||
if user:
|
|
||||||
bot_mention = user.mention # <@ID>
|
|
||||||
alt_mention = f"<@!{user.id}>" # <@!ID> (ping variant)
|
|
||||||
standard_mention = f"<@{user.id}>"
|
|
||||||
else:
|
|
||||||
bot_mention = None
|
|
||||||
alt_mention = None
|
|
||||||
standard_mention = ""
|
|
||||||
has_mention = (bot_mention and bot_mention in message.content) or (alt_mention and alt_mention in message.content) or (standard_mention and standard_mention in message.content)
|
|
||||||
|
|
||||||
# Strip mention from text for processing
|
|
||||||
if has_mention:
|
|
||||||
text = text.replace(bot_mention or "", "").replace(alt_mention or "", "").replace(standard_mention or "", "").strip()
|
|
||||||
# Don't return early if text is empty — still process the mention (e.g., create thread)
|
|
||||||
|
|
||||||
# --- Determine thread/channel routing and typing target ---
|
|
||||||
thread_id = None
|
|
||||||
chat_id = None
|
|
||||||
typing_target = None # The Discord object to type into
|
|
||||||
|
|
||||||
if isinstance(message.channel, self._discord_module.Thread):
|
if isinstance(message.channel, self._discord_module.Thread):
|
||||||
# --- Message already inside a thread ---
|
chat_id = str(message.channel.parent_id or message.channel.id)
|
||||||
thread_obj = message.channel
|
thread_id = str(message.channel.id)
|
||||||
thread_id = str(thread_obj.id)
|
|
||||||
chat_id = str(thread_obj.parent_id or thread_obj.id)
|
|
||||||
typing_target = thread_obj
|
|
||||||
|
|
||||||
# If this is a known active thread, process normally
|
|
||||||
if thread_id in self._active_thread_ids:
|
|
||||||
msg_type = InboundMessageType.COMMAND if text.startswith("/") else InboundMessageType.CHAT
|
|
||||||
inbound = self._make_inbound(
|
|
||||||
chat_id=chat_id,
|
|
||||||
user_id=str(message.author.id),
|
|
||||||
text=text,
|
|
||||||
msg_type=msg_type,
|
|
||||||
thread_ts=thread_id,
|
|
||||||
metadata={
|
|
||||||
"guild_id": str(guild.id) if guild else None,
|
|
||||||
"channel_id": str(message.channel.id),
|
|
||||||
"message_id": str(message.id),
|
|
||||||
},
|
|
||||||
)
|
|
||||||
inbound.topic_id = thread_id
|
|
||||||
self._publish(inbound)
|
|
||||||
# Start typing indicator in the thread
|
|
||||||
if typing_target:
|
|
||||||
asyncio.create_task(self._start_typing(typing_target, chat_id, thread_id))
|
|
||||||
asyncio.create_task(self._add_reaction(message))
|
|
||||||
return
|
|
||||||
|
|
||||||
# Thread not tracked (orphaned) — create new thread and handle below
|
|
||||||
logger.debug("[Discord] message in orphaned thread %s, will create new thread", thread_id)
|
|
||||||
thread_id = None
|
|
||||||
typing_target = None
|
|
||||||
|
|
||||||
# At this point we're guaranteed to be in a channel, not a thread
|
|
||||||
# (the Thread case is handled above). Apply mention_only for all
|
|
||||||
# non-thread messages — no special case needed.
|
|
||||||
channel_id = str(message.channel.id)
|
|
||||||
|
|
||||||
# Check if there's an active thread for this channel
|
|
||||||
if channel_id in self._active_threads:
|
|
||||||
# respect mention_only: if enabled, only process messages that mention the bot
|
|
||||||
# (unless the channel is in allowed_channels)
|
|
||||||
# Messages within a thread are always allowed through (continuation).
|
|
||||||
# At this code point we know the message is in a channel, not a thread
|
|
||||||
# (Thread case handled above), so always apply the check.
|
|
||||||
if self._mention_only and not has_mention and channel_id not in self._allowed_channels:
|
|
||||||
logger.debug("[Discord] skipping no-@ message in channel %s (not in thread)", channel_id)
|
|
||||||
return
|
|
||||||
# mention_only + fresh @ → create new thread instead of routing to existing one
|
|
||||||
if self._mention_only and has_mention:
|
|
||||||
thread_obj = await self._create_thread(message)
|
|
||||||
if thread_obj is not None:
|
|
||||||
target_thread_id = str(thread_obj.id)
|
|
||||||
self._active_threads[channel_id] = target_thread_id
|
|
||||||
self._save_thread(channel_id, target_thread_id)
|
|
||||||
thread_id = target_thread_id
|
|
||||||
chat_id = channel_id
|
|
||||||
typing_target = thread_obj
|
|
||||||
logger.info("[Discord] created new thread %s in channel %s on mention (replacing existing thread)", target_thread_id, channel_id)
|
|
||||||
else:
|
|
||||||
logger.info("[Discord] thread creation failed in channel %s, falling back to channel replies", channel_id)
|
|
||||||
thread_id = channel_id
|
|
||||||
chat_id = channel_id
|
|
||||||
typing_target = message.channel
|
|
||||||
else:
|
|
||||||
# Existing session → route to the existing thread
|
|
||||||
target_thread_id = self._active_threads[channel_id]
|
|
||||||
logger.debug("[Discord] routing message in channel %s to existing thread %s", channel_id, target_thread_id)
|
|
||||||
thread_id = target_thread_id
|
|
||||||
chat_id = channel_id
|
|
||||||
typing_target = await self._get_channel_or_thread(target_thread_id)
|
|
||||||
elif self._mention_only and not has_mention and channel_id not in self._allowed_channels:
|
|
||||||
# Not mentioned and not in an allowed channel → skip
|
|
||||||
logger.debug("[Discord] skipping message without mention in channel %s", channel_id)
|
|
||||||
return
|
|
||||||
elif self._mention_only and has_mention:
|
|
||||||
# First mention in this channel → create thread
|
|
||||||
thread_obj = await self._create_thread(message)
|
|
||||||
if thread_obj is not None:
|
|
||||||
target_thread_id = str(thread_obj.id)
|
|
||||||
self._active_threads[channel_id] = target_thread_id
|
|
||||||
self._save_thread(channel_id, target_thread_id)
|
|
||||||
thread_id = target_thread_id
|
|
||||||
chat_id = channel_id
|
|
||||||
typing_target = thread_obj # Type into the new thread
|
|
||||||
logger.info("[Discord] created thread %s in channel %s for user %s", target_thread_id, channel_id, message.author.display_name)
|
|
||||||
else:
|
|
||||||
# Fallback: thread creation failed (disabled/permissions), reply in channel
|
|
||||||
logger.info("[Discord] thread creation failed in channel %s, falling back to channel replies", channel_id)
|
|
||||||
thread_id = channel_id
|
|
||||||
chat_id = channel_id
|
|
||||||
typing_target = message.channel # Type into the channel
|
|
||||||
elif self._thread_mode:
|
|
||||||
# thread_mode but mention_only is False → create thread anyway for conversation grouping
|
|
||||||
thread_obj = await self._create_thread(message)
|
|
||||||
if thread_obj is None:
|
|
||||||
# Thread creation failed (disabled/permissions), fall back to channel replies
|
|
||||||
logger.info("[Discord] thread creation failed in channel %s, falling back to channel replies", channel_id)
|
|
||||||
thread_id = channel_id
|
|
||||||
chat_id = channel_id
|
|
||||||
typing_target = message.channel # Type into the channel
|
|
||||||
else:
|
|
||||||
target_thread_id = str(thread_obj.id)
|
|
||||||
self._active_threads[channel_id] = target_thread_id
|
|
||||||
self._save_thread(channel_id, target_thread_id)
|
|
||||||
thread_id = target_thread_id
|
|
||||||
chat_id = channel_id
|
|
||||||
typing_target = thread_obj # Type into the new thread
|
|
||||||
else:
|
else:
|
||||||
# No threading — reply directly in channel
|
thread = await self._create_thread(message)
|
||||||
thread_id = channel_id
|
if thread is None:
|
||||||
chat_id = channel_id
|
return
|
||||||
typing_target = message.channel # Type into the channel
|
chat_id = str(message.channel.id)
|
||||||
|
thread_id = str(thread.id)
|
||||||
|
|
||||||
msg_type = InboundMessageType.COMMAND if text.startswith("/") else InboundMessageType.CHAT
|
msg_type = InboundMessageType.COMMAND if text.startswith("/") else InboundMessageType.CHAT
|
||||||
inbound = self._make_inbound(
|
inbound = self._make_inbound(
|
||||||
@@ -422,15 +177,6 @@ class DiscordChannel(Channel):
|
|||||||
)
|
)
|
||||||
inbound.topic_id = thread_id
|
inbound.topic_id = thread_id
|
||||||
|
|
||||||
# Start typing indicator in the correct target (thread or channel)
|
|
||||||
if typing_target:
|
|
||||||
asyncio.create_task(self._start_typing(typing_target, chat_id, thread_id))
|
|
||||||
|
|
||||||
self._publish(inbound)
|
|
||||||
asyncio.create_task(self._add_reaction(message))
|
|
||||||
|
|
||||||
def _publish(self, inbound) -> None:
|
|
||||||
"""Publish an inbound message to the main event loop."""
|
|
||||||
if self._main_loop and self._main_loop.is_running():
|
if self._main_loop and self._main_loop.is_running():
|
||||||
future = asyncio.run_coroutine_threadsafe(self.bus.publish_inbound(inbound), self._main_loop)
|
future = asyncio.run_coroutine_threadsafe(self.bus.publish_inbound(inbound), self._main_loop)
|
||||||
future.add_done_callback(lambda f: logger.exception("[Discord] publish_inbound failed", exc_info=f.exception()) if f.exception() else None)
|
future.add_done_callback(lambda f: logger.exception("[Discord] publish_inbound failed", exc_info=f.exception()) if f.exception() else None)
|
||||||
@@ -452,40 +198,14 @@ class DiscordChannel(Channel):
|
|||||||
|
|
||||||
async def _create_thread(self, message):
|
async def _create_thread(self, message):
|
||||||
try:
|
try:
|
||||||
if self._discord_module is None:
|
|
||||||
return None
|
|
||||||
|
|
||||||
# Only TextChannel (type 0) and NewsChannel (type 10) support threads
|
|
||||||
channel_type = message.channel.type
|
|
||||||
if channel_type not in (
|
|
||||||
self._discord_module.ChannelType.text,
|
|
||||||
self._discord_module.ChannelType.news,
|
|
||||||
):
|
|
||||||
logger.info(
|
|
||||||
"[Discord] channel type %s (%s) does not support threads",
|
|
||||||
channel_type.value,
|
|
||||||
channel_type.name,
|
|
||||||
)
|
|
||||||
return None
|
|
||||||
|
|
||||||
thread_name = f"deerflow-{message.author.display_name}-{message.id}"[:100]
|
thread_name = f"deerflow-{message.author.display_name}-{message.id}"[:100]
|
||||||
return await message.create_thread(name=thread_name)
|
return await message.create_thread(name=thread_name)
|
||||||
except self._discord_module.errors.HTTPException as exc:
|
|
||||||
if exc.code == 50024:
|
|
||||||
logger.info(
|
|
||||||
"[Discord] cannot create thread in channel %s (error code 50024): %s",
|
|
||||||
message.channel.id,
|
|
||||||
channel_type.name if (channel_type := message.channel.type) else "unknown",
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
logger.exception(
|
|
||||||
"[Discord] failed to create thread for message=%s (HTTPException %s)",
|
|
||||||
message.id,
|
|
||||||
exc.code,
|
|
||||||
)
|
|
||||||
return None
|
|
||||||
except Exception:
|
except Exception:
|
||||||
logger.exception("[Discord] failed to create thread for message=%s (threads may be disabled or missing permissions)", message.id)
|
logger.exception("[Discord] failed to create thread for message=%s (threads may be disabled or missing permissions)", message.id)
|
||||||
|
try:
|
||||||
|
await message.channel.send("Could not create a thread for your message. Please check that threads are enabled in this channel.")
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
return None
|
return None
|
||||||
|
|
||||||
async def _resolve_target(self, msg: OutboundMessage):
|
async def _resolve_target(self, msg: OutboundMessage):
|
||||||
|
|||||||
@@ -146,6 +146,13 @@ def _normalize_custom_agent_name(raw_value: str) -> str:
|
|||||||
return normalized
|
return normalized
|
||||||
|
|
||||||
|
|
||||||
|
def _strip_loop_warning_text(text: str) -> str:
|
||||||
|
"""Remove middleware-authored loop warning lines from display text."""
|
||||||
|
if "[LOOP DETECTED]" not in text:
|
||||||
|
return text
|
||||||
|
return "\n".join(line for line in text.splitlines() if "[LOOP DETECTED]" not in line).strip()
|
||||||
|
|
||||||
|
|
||||||
def _extract_response_text(result: dict | list) -> str:
|
def _extract_response_text(result: dict | list) -> str:
|
||||||
"""Extract the last AI message text from a LangGraph runs.wait result.
|
"""Extract the last AI message text from a LangGraph runs.wait result.
|
||||||
|
|
||||||
@@ -155,6 +162,7 @@ def _extract_response_text(result: dict | list) -> str:
|
|||||||
Handles special cases:
|
Handles special cases:
|
||||||
- Regular AI text responses
|
- Regular AI text responses
|
||||||
- Clarification interrupts (``ask_clarification`` tool messages)
|
- Clarification interrupts (``ask_clarification`` tool messages)
|
||||||
|
- Strips loop-detection warnings attached to tool-call AI messages
|
||||||
"""
|
"""
|
||||||
if isinstance(result, list):
|
if isinstance(result, list):
|
||||||
messages = result
|
messages = result
|
||||||
@@ -173,8 +181,6 @@ def _extract_response_text(result: dict | list) -> str:
|
|||||||
|
|
||||||
# Stop at the last human message — anything before it is a previous turn
|
# Stop at the last human message — anything before it is a previous turn
|
||||||
if msg_type == "human":
|
if msg_type == "human":
|
||||||
if _is_hidden_human_control_message(msg):
|
|
||||||
continue
|
|
||||||
break
|
break
|
||||||
|
|
||||||
# Check for tool messages from ask_clarification (interrupt case)
|
# Check for tool messages from ask_clarification (interrupt case)
|
||||||
@@ -186,7 +192,12 @@ def _extract_response_text(result: dict | list) -> str:
|
|||||||
# Regular AI message with text content
|
# Regular AI message with text content
|
||||||
if msg_type == "ai":
|
if msg_type == "ai":
|
||||||
content = msg.get("content", "")
|
content = msg.get("content", "")
|
||||||
|
has_tool_calls = bool(msg.get("tool_calls"))
|
||||||
if isinstance(content, str) and content:
|
if isinstance(content, str) and content:
|
||||||
|
if has_tool_calls:
|
||||||
|
content = _strip_loop_warning_text(content)
|
||||||
|
if not content:
|
||||||
|
continue
|
||||||
return content
|
return content
|
||||||
# content can be a list of content blocks
|
# content can be a list of content blocks
|
||||||
if isinstance(content, list):
|
if isinstance(content, list):
|
||||||
@@ -197,6 +208,8 @@ def _extract_response_text(result: dict | list) -> str:
|
|||||||
elif isinstance(block, str):
|
elif isinstance(block, str):
|
||||||
parts.append(block)
|
parts.append(block)
|
||||||
text = "".join(parts)
|
text = "".join(parts)
|
||||||
|
if has_tool_calls:
|
||||||
|
text = _strip_loop_warning_text(text)
|
||||||
if text:
|
if text:
|
||||||
return text
|
return text
|
||||||
return ""
|
return ""
|
||||||
@@ -315,8 +328,6 @@ def _extract_artifacts(result: dict | list) -> list[str]:
|
|||||||
continue
|
continue
|
||||||
# Stop at the last human message — anything before it is a previous turn
|
# Stop at the last human message — anything before it is a previous turn
|
||||||
if msg.get("type") == "human":
|
if msg.get("type") == "human":
|
||||||
if _is_hidden_human_control_message(msg):
|
|
||||||
continue
|
|
||||||
break
|
break
|
||||||
# Look for AI messages with present_files tool calls
|
# Look for AI messages with present_files tool calls
|
||||||
if msg.get("type") == "ai":
|
if msg.get("type") == "ai":
|
||||||
@@ -329,18 +340,6 @@ def _extract_artifacts(result: dict | list) -> list[str]:
|
|||||||
return artifacts
|
return artifacts
|
||||||
|
|
||||||
|
|
||||||
def _is_hidden_human_control_message(msg: Mapping[str, Any]) -> bool:
|
|
||||||
"""Return whether a human message is an internal control message hidden from UI."""
|
|
||||||
if msg.get("type") != "human":
|
|
||||||
return False
|
|
||||||
|
|
||||||
additional_kwargs = msg.get("additional_kwargs")
|
|
||||||
if not isinstance(additional_kwargs, Mapping):
|
|
||||||
return False
|
|
||||||
|
|
||||||
return additional_kwargs.get("hide_from_ui") is True
|
|
||||||
|
|
||||||
|
|
||||||
def _format_artifact_text(artifacts: list[str]) -> str:
|
def _format_artifact_text(artifacts: list[str]) -> str:
|
||||||
"""Format artifact paths into a human-readable text block listing filenames."""
|
"""Format artifact paths into a human-readable text block listing filenames."""
|
||||||
import posixpath
|
import posixpath
|
||||||
@@ -788,22 +787,13 @@ class ChannelManager:
|
|||||||
return
|
return
|
||||||
|
|
||||||
logger.info("[Manager] invoking runs.wait(thread_id=%s, text=%r)", thread_id, msg.text[:100])
|
logger.info("[Manager] invoking runs.wait(thread_id=%s, text=%r)", thread_id, msg.text[:100])
|
||||||
try:
|
result = await client.runs.wait(
|
||||||
result = await client.runs.wait(
|
thread_id,
|
||||||
thread_id,
|
assistant_id,
|
||||||
assistant_id,
|
input={"messages": [{"role": "human", "content": msg.text}]},
|
||||||
input={"messages": [{"role": "human", "content": msg.text}]},
|
config=run_config,
|
||||||
config=run_config,
|
context=run_context,
|
||||||
context=run_context,
|
)
|
||||||
multitask_strategy="reject",
|
|
||||||
)
|
|
||||||
except Exception as exc:
|
|
||||||
if _is_thread_busy_error(exc):
|
|
||||||
logger.warning("[Manager] thread busy (concurrent run rejected): thread_id=%s", thread_id)
|
|
||||||
await self._send_error(msg, THREAD_BUSY_MESSAGE)
|
|
||||||
return
|
|
||||||
else:
|
|
||||||
raise
|
|
||||||
|
|
||||||
response_text = _extract_response_text(result)
|
response_text = _extract_response_text(result)
|
||||||
artifacts = _extract_artifacts(result)
|
artifacts = _extract_artifacts(result)
|
||||||
|
|||||||
@@ -167,8 +167,6 @@ class ChannelService:
|
|||||||
return False
|
return False
|
||||||
|
|
||||||
try:
|
try:
|
||||||
config = dict(config)
|
|
||||||
config["channel_store"] = self.store
|
|
||||||
channel = channel_cls(bus=self.bus, config=config)
|
channel = channel_cls(bus=self.bus, config=config)
|
||||||
self._channels[name] = channel
|
self._channels[name] = channel
|
||||||
await channel.start()
|
await channel.start()
|
||||||
|
|||||||
@@ -161,16 +161,10 @@ async def _migrate_orphaned_threads(store, admin_user_id: str) -> int:
|
|||||||
async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
|
async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
|
||||||
"""Application lifespan handler."""
|
"""Application lifespan handler."""
|
||||||
|
|
||||||
# Load config and check necessary environment variables at startup.
|
# Load config and check necessary environment variables at startup
|
||||||
# `startup_config` is a local snapshot used only for one-shot bootstrap
|
|
||||||
# work (logging level, langgraph_runtime engines, channels). Request-time
|
|
||||||
# config resolution always routes through `get_app_config()` in
|
|
||||||
# `app/gateway/deps.py::get_config()` so `config.yaml` edits become
|
|
||||||
# visible without a process restart. We deliberately do NOT cache this
|
|
||||||
# snapshot on `app.state` to keep that contract enforceable.
|
|
||||||
try:
|
try:
|
||||||
startup_config = get_app_config()
|
app.state.config = get_app_config()
|
||||||
apply_logging_level(startup_config.log_level)
|
apply_logging_level(app.state.config.log_level)
|
||||||
logger.info("Configuration loaded successfully")
|
logger.info("Configuration loaded successfully")
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
error_msg = f"Failed to load configuration during gateway startup: {e}"
|
error_msg = f"Failed to load configuration during gateway startup: {e}"
|
||||||
@@ -180,7 +174,7 @@ async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
|
|||||||
logger.info(f"Starting API Gateway on {config.host}:{config.port}")
|
logger.info(f"Starting API Gateway on {config.host}:{config.port}")
|
||||||
|
|
||||||
# Initialize LangGraph runtime components (StreamBridge, RunManager, checkpointer, store)
|
# Initialize LangGraph runtime components (StreamBridge, RunManager, checkpointer, store)
|
||||||
async with langgraph_runtime(app, startup_config):
|
async with langgraph_runtime(app):
|
||||||
logger.info("LangGraph runtime initialised")
|
logger.info("LangGraph runtime initialised")
|
||||||
|
|
||||||
# Check admin bootstrap state and migrate orphan threads after admin exists.
|
# Check admin bootstrap state and migrate orphan threads after admin exists.
|
||||||
@@ -191,7 +185,7 @@ async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
|
|||||||
try:
|
try:
|
||||||
from app.channels.service import start_channel_service
|
from app.channels.service import start_channel_service
|
||||||
|
|
||||||
channel_service = await start_channel_service(startup_config)
|
channel_service = await start_channel_service(app.state.config)
|
||||||
logger.info("Channel service started: %s", channel_service.get_status())
|
logger.info("Channel service started: %s", channel_service.get_status())
|
||||||
except Exception:
|
except Exception:
|
||||||
logger.exception("No IM channels configured or channel service failed to start")
|
logger.exception("No IM channels configured or channel service failed to start")
|
||||||
|
|||||||
@@ -8,8 +8,6 @@ from pydantic import BaseModel, Field
|
|||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
_SECRET_FILE = ".jwt_secret"
|
|
||||||
|
|
||||||
|
|
||||||
class AuthConfig(BaseModel):
|
class AuthConfig(BaseModel):
|
||||||
"""JWT and auth-related configuration. Parsed once at startup.
|
"""JWT and auth-related configuration. Parsed once at startup.
|
||||||
@@ -32,32 +30,6 @@ class AuthConfig(BaseModel):
|
|||||||
_auth_config: AuthConfig | None = None
|
_auth_config: AuthConfig | None = None
|
||||||
|
|
||||||
|
|
||||||
def _load_or_create_secret() -> str:
|
|
||||||
"""Load persisted JWT secret from ``{base_dir}/.jwt_secret``, or generate and persist a new one."""
|
|
||||||
from deerflow.config.paths import get_paths
|
|
||||||
|
|
||||||
paths = get_paths()
|
|
||||||
secret_file = paths.base_dir / _SECRET_FILE
|
|
||||||
|
|
||||||
try:
|
|
||||||
if secret_file.exists():
|
|
||||||
secret = secret_file.read_text(encoding="utf-8").strip()
|
|
||||||
if secret:
|
|
||||||
return secret
|
|
||||||
except OSError as exc:
|
|
||||||
raise RuntimeError(f"Failed to read JWT secret from {secret_file}. Set AUTH_JWT_SECRET explicitly or fix DEER_FLOW_HOME/base directory permissions so DeerFlow can read its persisted auth secret.") from exc
|
|
||||||
|
|
||||||
secret = secrets.token_urlsafe(32)
|
|
||||||
try:
|
|
||||||
secret_file.parent.mkdir(parents=True, exist_ok=True)
|
|
||||||
fd = os.open(secret_file, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
|
|
||||||
with os.fdopen(fd, "w", encoding="utf-8") as fh:
|
|
||||||
fh.write(secret)
|
|
||||||
except OSError as exc:
|
|
||||||
raise RuntimeError(f"Failed to persist JWT secret to {secret_file}. Set AUTH_JWT_SECRET explicitly or fix DEER_FLOW_HOME/base directory permissions so DeerFlow can store a stable auth secret.") from exc
|
|
||||||
return secret
|
|
||||||
|
|
||||||
|
|
||||||
def get_auth_config() -> AuthConfig:
|
def get_auth_config() -> AuthConfig:
|
||||||
"""Get the global AuthConfig instance. Parses from env on first call."""
|
"""Get the global AuthConfig instance. Parses from env on first call."""
|
||||||
global _auth_config
|
global _auth_config
|
||||||
@@ -67,11 +39,11 @@ def get_auth_config() -> AuthConfig:
|
|||||||
load_dotenv()
|
load_dotenv()
|
||||||
jwt_secret = os.environ.get("AUTH_JWT_SECRET")
|
jwt_secret = os.environ.get("AUTH_JWT_SECRET")
|
||||||
if not jwt_secret:
|
if not jwt_secret:
|
||||||
jwt_secret = _load_or_create_secret()
|
jwt_secret = secrets.token_urlsafe(32)
|
||||||
os.environ["AUTH_JWT_SECRET"] = jwt_secret
|
os.environ["AUTH_JWT_SECRET"] = jwt_secret
|
||||||
logger.warning(
|
logger.warning(
|
||||||
"⚠ AUTH_JWT_SECRET is not set — using an auto-generated secret "
|
"⚠ AUTH_JWT_SECRET is not set — using an auto-generated ephemeral secret. "
|
||||||
"persisted to .jwt_secret. Sessions will survive restarts. "
|
"Sessions will be invalidated on restart. "
|
||||||
"For production, add AUTH_JWT_SECRET to your .env file: "
|
"For production, add AUTH_JWT_SECRET to your .env file: "
|
||||||
'python -c "import secrets; print(secrets.token_urlsafe(32))"'
|
'python -c "import secrets; print(secrets.token_urlsafe(32))"'
|
||||||
)
|
)
|
||||||
|
|||||||
+17
-104
@@ -3,21 +3,11 @@
|
|||||||
**Getters** (used by routers): raise 503 when a required dependency is
|
**Getters** (used by routers): raise 503 when a required dependency is
|
||||||
missing, except ``get_store`` which returns ``None``.
|
missing, except ``get_store`` which returns ``None``.
|
||||||
|
|
||||||
``AppConfig`` is intentionally *not* cached on ``app.state``. Routers and the
|
|
||||||
run path resolve it through :func:`deerflow.config.app_config.get_app_config`,
|
|
||||||
which performs mtime-based hot reload, so edits to ``config.yaml`` take
|
|
||||||
effect on the next request without a process restart. The engines created in
|
|
||||||
:func:`langgraph_runtime` (stream bridge, persistence, checkpointer, store,
|
|
||||||
run-event store) accept a ``startup_config`` snapshot — they are
|
|
||||||
restart-required by design and stay bound to that snapshot to keep the live
|
|
||||||
process consistent with itself.
|
|
||||||
|
|
||||||
Initialization is handled directly in ``app.py`` via :class:`AsyncExitStack`.
|
Initialization is handled directly in ``app.py`` via :class:`AsyncExitStack`.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import logging
|
|
||||||
from collections.abc import AsyncGenerator, Callable
|
from collections.abc import AsyncGenerator, Callable
|
||||||
from contextlib import AsyncExitStack, asynccontextmanager
|
from contextlib import AsyncExitStack, asynccontextmanager
|
||||||
from typing import TYPE_CHECKING, TypeVar, cast
|
from typing import TYPE_CHECKING, TypeVar, cast
|
||||||
@@ -25,97 +15,36 @@ from typing import TYPE_CHECKING, TypeVar, cast
|
|||||||
from fastapi import FastAPI, HTTPException, Request
|
from fastapi import FastAPI, HTTPException, Request
|
||||||
from langgraph.types import Checkpointer
|
from langgraph.types import Checkpointer
|
||||||
|
|
||||||
from deerflow.config.app_config import AppConfig, get_app_config
|
from deerflow.config.app_config import AppConfig
|
||||||
from deerflow.persistence.feedback import FeedbackRepository
|
from deerflow.persistence.feedback import FeedbackRepository
|
||||||
from deerflow.runtime import RunContext, RunManager, StreamBridge
|
from deerflow.runtime import RunContext, RunManager, StreamBridge
|
||||||
from deerflow.runtime.events.store.base import RunEventStore
|
from deerflow.runtime.events.store.base import RunEventStore
|
||||||
from deerflow.runtime.runs.store.base import RunStore
|
from deerflow.runtime.runs.store.base import RunStore
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
|
||||||
|
|
||||||
if TYPE_CHECKING:
|
if TYPE_CHECKING:
|
||||||
from app.gateway.auth.local_provider import LocalAuthProvider
|
from app.gateway.auth.local_provider import LocalAuthProvider
|
||||||
from app.gateway.auth.repositories.sqlite import SQLiteUserRepository
|
from app.gateway.auth.repositories.sqlite import SQLiteUserRepository
|
||||||
from deerflow.persistence.thread_meta.base import ThreadMetaStore
|
from deerflow.persistence.thread_meta.base import ThreadMetaStore
|
||||||
from deerflow.runtime import RunRecord
|
|
||||||
|
|
||||||
|
|
||||||
T = TypeVar("T")
|
T = TypeVar("T")
|
||||||
|
|
||||||
|
|
||||||
async def _mark_latest_recovered_threads_error(
|
def get_config(request: Request) -> AppConfig:
|
||||||
run_manager: RunManager,
|
"""Return the app-scoped ``AppConfig`` stored on ``app.state``."""
|
||||||
thread_store: ThreadMetaStore,
|
config = getattr(request.app.state, "config", None)
|
||||||
recovered_runs: list[RunRecord],
|
if config is None:
|
||||||
) -> None:
|
raise HTTPException(status_code=503, detail="Configuration not available")
|
||||||
"""Mark thread status as error only when its newest run was recovered."""
|
return config
|
||||||
recovered_by_thread: dict[str, set[str]] = {}
|
|
||||||
for record in recovered_runs:
|
|
||||||
recovered_by_thread.setdefault(record.thread_id, set()).add(record.run_id)
|
|
||||||
|
|
||||||
for thread_id, recovered_run_ids in recovered_by_thread.items():
|
|
||||||
try:
|
|
||||||
latest_runs = await run_manager.list_by_thread(thread_id, user_id=None, limit=1)
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to find latest run for thread %s during run reconciliation", thread_id, exc_info=True)
|
|
||||||
continue
|
|
||||||
if not latest_runs or latest_runs[0].run_id not in recovered_run_ids:
|
|
||||||
continue
|
|
||||||
try:
|
|
||||||
await thread_store.update_status(thread_id, "error", user_id=None)
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to mark thread %s as error during run reconciliation", thread_id, exc_info=True)
|
|
||||||
|
|
||||||
|
|
||||||
def get_config() -> AppConfig:
|
|
||||||
"""Return the freshest ``AppConfig`` for the current request.
|
|
||||||
|
|
||||||
Routes through :func:`deerflow.config.app_config.get_app_config`, which
|
|
||||||
honours runtime ``ContextVar`` overrides and reloads ``config.yaml`` from
|
|
||||||
disk when its mtime changes. ``AppConfig`` is not cached on ``app.state``
|
|
||||||
at all — the only startup-time snapshot lives as a local
|
|
||||||
``startup_config`` variable inside ``lifespan()`` and is passed
|
|
||||||
explicitly into :func:`langgraph_runtime` for the engines that are
|
|
||||||
restart-required by design. Routing every request through
|
|
||||||
:func:`get_app_config` closes the bytedance/deer-flow issue #3107 BUG-001
|
|
||||||
split-brain where the worker / lead-agent thread saw a stale startup
|
|
||||||
snapshot.
|
|
||||||
|
|
||||||
Any failure to materialise the config (missing file, permission denied,
|
|
||||||
YAML parse error, validation error) is reported as 503 — semantically
|
|
||||||
"the gateway cannot serve requests without a usable configuration" — and
|
|
||||||
logged with the original exception so operators have something to debug.
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
return get_app_config()
|
|
||||||
except Exception as exc: # noqa: BLE001 - request boundary: log and degrade gracefully
|
|
||||||
logger.exception("Failed to load AppConfig at request time")
|
|
||||||
raise HTTPException(status_code=503, detail="Configuration not available") from exc
|
|
||||||
|
|
||||||
|
|
||||||
@asynccontextmanager
|
@asynccontextmanager
|
||||||
async def langgraph_runtime(app: FastAPI, startup_config: AppConfig) -> AsyncGenerator[None, None]:
|
async def langgraph_runtime(app: FastAPI) -> AsyncGenerator[None, None]:
|
||||||
"""Bootstrap and tear down all LangGraph runtime singletons.
|
"""Bootstrap and tear down all LangGraph runtime singletons.
|
||||||
|
|
||||||
``startup_config`` is the ``AppConfig`` snapshot taken once during
|
|
||||||
``lifespan()`` for one-shot infrastructure bootstrap. The engines and
|
|
||||||
stores constructed here (stream bridge, persistence engine, checkpointer,
|
|
||||||
store, run-event store) are restart-required by design — they hold live
|
|
||||||
connections, file handles, or singleton providers — so they bind to this
|
|
||||||
snapshot and survive across `config.yaml` edits. Request-time consumers
|
|
||||||
must still go through :func:`get_config` for any field that should be
|
|
||||||
hot-reloadable. See ``backend/CLAUDE.md`` "Config Hot-Reload Boundary".
|
|
||||||
|
|
||||||
The matching ``run_events_config`` is frozen onto ``app.state`` so
|
|
||||||
:func:`get_run_context` pairs a freshly-loaded ``AppConfig`` with the
|
|
||||||
*startup-time* run-events configuration the underlying ``event_store``
|
|
||||||
was built from — otherwise the runtime could end up combining a live
|
|
||||||
new ``run_events_config`` with an event store still bound to the
|
|
||||||
previous backend.
|
|
||||||
|
|
||||||
Usage in ``app.py``::
|
Usage in ``app.py``::
|
||||||
|
|
||||||
async with langgraph_runtime(app, startup_config):
|
async with langgraph_runtime(app):
|
||||||
yield
|
yield
|
||||||
"""
|
"""
|
||||||
from deerflow.persistence.engine import close_engine, get_session_factory, init_engine_from_config
|
from deerflow.persistence.engine import close_engine, get_session_factory, init_engine_from_config
|
||||||
@@ -124,7 +53,9 @@ async def langgraph_runtime(app: FastAPI, startup_config: AppConfig) -> AsyncGen
|
|||||||
from deerflow.runtime.events.store import make_run_event_store
|
from deerflow.runtime.events.store import make_run_event_store
|
||||||
|
|
||||||
async with AsyncExitStack() as stack:
|
async with AsyncExitStack() as stack:
|
||||||
config = startup_config
|
config = getattr(app.state, "config", None)
|
||||||
|
if config is None:
|
||||||
|
raise RuntimeError("langgraph_runtime() requires app.state.config to be initialized")
|
||||||
|
|
||||||
app.state.stream_bridge = await stack.enter_async_context(make_stream_bridge(config))
|
app.state.stream_bridge = await stack.enter_async_context(make_stream_bridge(config))
|
||||||
|
|
||||||
@@ -153,26 +84,12 @@ async def langgraph_runtime(app: FastAPI, startup_config: AppConfig) -> AsyncGen
|
|||||||
|
|
||||||
app.state.thread_store = make_thread_store(sf, app.state.store)
|
app.state.thread_store = make_thread_store(sf, app.state.store)
|
||||||
|
|
||||||
# Run event store. The store and the matching ``run_events_config`` are
|
# Run event store (has its own factory with config-driven backend selection)
|
||||||
# both frozen at startup so ``get_run_context`` does not combine a
|
|
||||||
# freshly-reloaded ``AppConfig.run_events`` with a store still bound to
|
|
||||||
# the previous backend.
|
|
||||||
run_events_config = getattr(config, "run_events", None)
|
run_events_config = getattr(config, "run_events", None)
|
||||||
app.state.run_events_config = run_events_config
|
|
||||||
app.state.run_event_store = make_run_event_store(run_events_config)
|
app.state.run_event_store = make_run_event_store(run_events_config)
|
||||||
|
|
||||||
# RunManager with store backing for persistence
|
# RunManager with store backing for persistence
|
||||||
app.state.run_manager = RunManager(store=app.state.run_store)
|
app.state.run_manager = RunManager(store=app.state.run_store)
|
||||||
if getattr(config.database, "backend", None) == "sqlite":
|
|
||||||
from deerflow.utils.time import now_iso
|
|
||||||
|
|
||||||
# Startup-only recovery: clean shutdowns return no active rows and
|
|
||||||
# the thread-status update below becomes a no-op.
|
|
||||||
recovered_runs = await app.state.run_manager.reconcile_orphaned_inflight_runs(
|
|
||||||
error="Gateway restarted before this run reached a durable final state.",
|
|
||||||
before=now_iso(),
|
|
||||||
)
|
|
||||||
await _mark_latest_recovered_threads_error(app.state.run_manager, app.state.thread_store, recovered_runs)
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
yield
|
yield
|
||||||
@@ -222,20 +139,16 @@ def get_thread_store(request: Request) -> ThreadMetaStore:
|
|||||||
def get_run_context(request: Request) -> RunContext:
|
def get_run_context(request: Request) -> RunContext:
|
||||||
"""Build a :class:`RunContext` from ``app.state`` singletons.
|
"""Build a :class:`RunContext` from ``app.state`` singletons.
|
||||||
|
|
||||||
Returns a *base* context with infrastructure dependencies. The
|
Returns a *base* context with infrastructure dependencies.
|
||||||
``app_config`` field is resolved live so per-run fields (e.g.
|
|
||||||
``models[*].max_tokens``) follow ``config.yaml`` edits; the
|
|
||||||
``event_store`` / ``run_events_config`` pair stays frozen to the snapshot
|
|
||||||
captured in :func:`langgraph_runtime` so callers never see a store bound
|
|
||||||
to one backend paired with a config pointing at another.
|
|
||||||
"""
|
"""
|
||||||
|
config = get_config(request)
|
||||||
return RunContext(
|
return RunContext(
|
||||||
checkpointer=get_checkpointer(request),
|
checkpointer=get_checkpointer(request),
|
||||||
store=get_store(request),
|
store=get_store(request),
|
||||||
event_store=get_run_event_store(request),
|
event_store=get_run_event_store(request),
|
||||||
run_events_config=getattr(request.app.state, "run_events_config", None),
|
run_events_config=getattr(config, "run_events", None),
|
||||||
thread_store=get_thread_store(request),
|
thread_store=get_thread_store(request),
|
||||||
app_config=get_config(),
|
app_config=config,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
@@ -1,34 +1,23 @@
|
|||||||
"""Authentication for trusted Gateway internal callers."""
|
"""Process-local authentication for Gateway internal callers."""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import os
|
|
||||||
import secrets
|
import secrets
|
||||||
from types import SimpleNamespace
|
from types import SimpleNamespace
|
||||||
|
|
||||||
from deerflow.runtime.user_context import DEFAULT_USER_ID
|
from deerflow.runtime.user_context import DEFAULT_USER_ID
|
||||||
|
|
||||||
INTERNAL_AUTH_HEADER_NAME = "X-DeerFlow-Internal-Token"
|
INTERNAL_AUTH_HEADER_NAME = "X-DeerFlow-Internal-Token"
|
||||||
INTERNAL_AUTH_ENV_VAR = "DEER_FLOW_INTERNAL_AUTH_TOKEN"
|
_INTERNAL_AUTH_TOKEN = secrets.token_urlsafe(32)
|
||||||
|
|
||||||
|
|
||||||
def _load_internal_auth_token() -> str:
|
|
||||||
token = os.environ.get(INTERNAL_AUTH_ENV_VAR)
|
|
||||||
if token:
|
|
||||||
return token
|
|
||||||
return secrets.token_urlsafe(32)
|
|
||||||
|
|
||||||
|
|
||||||
_INTERNAL_AUTH_TOKEN = _load_internal_auth_token()
|
|
||||||
|
|
||||||
|
|
||||||
def create_internal_auth_headers() -> dict[str, str]:
|
def create_internal_auth_headers() -> dict[str, str]:
|
||||||
"""Return headers that authenticate trusted Gateway internal calls."""
|
"""Return headers that authenticate same-process Gateway internal calls."""
|
||||||
return {INTERNAL_AUTH_HEADER_NAME: _INTERNAL_AUTH_TOKEN}
|
return {INTERNAL_AUTH_HEADER_NAME: _INTERNAL_AUTH_TOKEN}
|
||||||
|
|
||||||
|
|
||||||
def is_valid_internal_auth_token(token: str | None) -> bool:
|
def is_valid_internal_auth_token(token: str | None) -> bool:
|
||||||
"""Return True when *token* matches this Gateway worker's internal token."""
|
"""Return True when *token* matches the process-local internal token."""
|
||||||
return bool(token) and secrets.compare_digest(token, _INTERNAL_AUTH_TOKEN)
|
return bool(token) and secrets.compare_digest(token, _INTERNAL_AUTH_TOKEN)
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
@@ -20,9 +20,6 @@ ACTIVE_CONTENT_MIME_TYPES = {
|
|||||||
"image/svg+xml",
|
"image/svg+xml",
|
||||||
}
|
}
|
||||||
|
|
||||||
MAX_SKILL_ARCHIVE_MEMBER_BYTES = 16 * 1024 * 1024
|
|
||||||
_SKILL_ARCHIVE_READ_CHUNK_SIZE = 64 * 1024
|
|
||||||
|
|
||||||
|
|
||||||
def _build_content_disposition(disposition_type: str, filename: str) -> str:
|
def _build_content_disposition(disposition_type: str, filename: str) -> str:
|
||||||
"""Build an RFC 5987 encoded Content-Disposition header value."""
|
"""Build an RFC 5987 encoded Content-Disposition header value."""
|
||||||
@@ -47,22 +44,6 @@ def is_text_file_by_content(path: Path, sample_size: int = 8192) -> bool:
|
|||||||
return False
|
return False
|
||||||
|
|
||||||
|
|
||||||
def _read_skill_archive_member(zip_ref: zipfile.ZipFile, info: zipfile.ZipInfo) -> bytes:
|
|
||||||
"""Read a .skill archive member while enforcing an uncompressed size cap."""
|
|
||||||
if info.file_size > MAX_SKILL_ARCHIVE_MEMBER_BYTES:
|
|
||||||
raise HTTPException(status_code=413, detail="Skill archive member is too large to preview")
|
|
||||||
|
|
||||||
chunks: list[bytes] = []
|
|
||||||
total_read = 0
|
|
||||||
with zip_ref.open(info, "r") as src:
|
|
||||||
while chunk := src.read(_SKILL_ARCHIVE_READ_CHUNK_SIZE):
|
|
||||||
total_read += len(chunk)
|
|
||||||
if total_read > MAX_SKILL_ARCHIVE_MEMBER_BYTES:
|
|
||||||
raise HTTPException(status_code=413, detail="Skill archive member is too large to preview")
|
|
||||||
chunks.append(chunk)
|
|
||||||
return b"".join(chunks)
|
|
||||||
|
|
||||||
|
|
||||||
def _extract_file_from_skill_archive(zip_path: Path, internal_path: str) -> bytes | None:
|
def _extract_file_from_skill_archive(zip_path: Path, internal_path: str) -> bytes | None:
|
||||||
"""Extract a file from a .skill ZIP archive.
|
"""Extract a file from a .skill ZIP archive.
|
||||||
|
|
||||||
@@ -79,16 +60,16 @@ def _extract_file_from_skill_archive(zip_path: Path, internal_path: str) -> byte
|
|||||||
try:
|
try:
|
||||||
with zipfile.ZipFile(zip_path, "r") as zip_ref:
|
with zipfile.ZipFile(zip_path, "r") as zip_ref:
|
||||||
# List all files in the archive
|
# List all files in the archive
|
||||||
infos_by_name = {info.filename: info for info in zip_ref.infolist()}
|
namelist = zip_ref.namelist()
|
||||||
|
|
||||||
# Try direct path first
|
# Try direct path first
|
||||||
if internal_path in infos_by_name:
|
if internal_path in namelist:
|
||||||
return _read_skill_archive_member(zip_ref, infos_by_name[internal_path])
|
return zip_ref.read(internal_path)
|
||||||
|
|
||||||
# Try with any top-level directory prefix (e.g., "skill-name/SKILL.md")
|
# Try with any top-level directory prefix (e.g., "skill-name/SKILL.md")
|
||||||
for name, info in infos_by_name.items():
|
for name in namelist:
|
||||||
if name.endswith("/" + internal_path) or name == internal_path:
|
if name.endswith("/" + internal_path) or name == internal_path:
|
||||||
return _read_skill_archive_member(zip_ref, info)
|
return zip_ref.read(name)
|
||||||
|
|
||||||
# Not found
|
# Not found
|
||||||
return None
|
return None
|
||||||
|
|||||||
@@ -1,6 +1,5 @@
|
|||||||
"""Authentication endpoints."""
|
"""Authentication endpoints."""
|
||||||
|
|
||||||
import asyncio
|
|
||||||
import logging
|
import logging
|
||||||
import os
|
import os
|
||||||
import time
|
import time
|
||||||
@@ -383,15 +382,9 @@ async def get_me(request: Request):
|
|||||||
return UserResponse(id=str(user.id), email=user.email, system_role=user.system_role, needs_setup=user.needs_setup)
|
return UserResponse(id=str(user.id), email=user.email, system_role=user.system_role, needs_setup=user.needs_setup)
|
||||||
|
|
||||||
|
|
||||||
# Per-IP cache: ip → (timestamp, result_dict).
|
_SETUP_STATUS_COOLDOWN: dict[str, float] = {}
|
||||||
# Returns the cached result within the TTL instead of 429, because
|
_SETUP_STATUS_COOLDOWN_SECONDS = 60
|
||||||
# the answer (whether an admin exists) rarely changes and returning
|
|
||||||
# 429 breaks multi-tab / post-restart reconnection storms.
|
|
||||||
_SETUP_STATUS_CACHE: dict[str, tuple[float, dict]] = {}
|
|
||||||
_SETUP_STATUS_CACHE_TTL_SECONDS = 60
|
|
||||||
_MAX_TRACKED_SETUP_STATUS_IPS = 10000
|
_MAX_TRACKED_SETUP_STATUS_IPS = 10000
|
||||||
_SETUP_STATUS_INFLIGHT: dict[str, asyncio.Task[dict]] = {}
|
|
||||||
_SETUP_STATUS_INFLIGHT_GUARD = asyncio.Lock()
|
|
||||||
|
|
||||||
|
|
||||||
@router.get("/setup-status")
|
@router.get("/setup-status")
|
||||||
@@ -399,56 +392,29 @@ async def setup_status(request: Request):
|
|||||||
"""Check if an admin account exists. Returns needs_setup=True when no admin exists."""
|
"""Check if an admin account exists. Returns needs_setup=True when no admin exists."""
|
||||||
client_ip = _get_client_ip(request)
|
client_ip = _get_client_ip(request)
|
||||||
now = time.time()
|
now = time.time()
|
||||||
|
last_check = _SETUP_STATUS_COOLDOWN.get(client_ip, 0)
|
||||||
# Return cached result when within TTL — avoids 429 on multi-tab reconnection.
|
elapsed = now - last_check
|
||||||
cached = _SETUP_STATUS_CACHE.get(client_ip)
|
if elapsed < _SETUP_STATUS_COOLDOWN_SECONDS:
|
||||||
if cached is not None:
|
retry_after = max(1, int(_SETUP_STATUS_COOLDOWN_SECONDS - elapsed))
|
||||||
cached_time, cached_result = cached
|
raise HTTPException(
|
||||||
if now - cached_time < _SETUP_STATUS_CACHE_TTL_SECONDS:
|
status_code=status.HTTP_429_TOO_MANY_REQUESTS,
|
||||||
return cached_result
|
detail="Setup status check is rate limited",
|
||||||
|
headers={"Retry-After": str(retry_after)},
|
||||||
async with _SETUP_STATUS_INFLIGHT_GUARD:
|
)
|
||||||
# Recheck cache after waiting for the inflight guard.
|
# Evict stale entries when dict grows too large to bound memory usage.
|
||||||
now = time.time()
|
if len(_SETUP_STATUS_COOLDOWN) >= _MAX_TRACKED_SETUP_STATUS_IPS:
|
||||||
cached = _SETUP_STATUS_CACHE.get(client_ip)
|
cutoff = now - _SETUP_STATUS_COOLDOWN_SECONDS
|
||||||
if cached is not None:
|
stale = [k for k, t in _SETUP_STATUS_COOLDOWN.items() if t < cutoff]
|
||||||
cached_time, cached_result = cached
|
for k in stale:
|
||||||
if now - cached_time < _SETUP_STATUS_CACHE_TTL_SECONDS:
|
del _SETUP_STATUS_COOLDOWN[k]
|
||||||
return cached_result
|
# If still too large after evicting expired entries, remove oldest half.
|
||||||
|
if len(_SETUP_STATUS_COOLDOWN) >= _MAX_TRACKED_SETUP_STATUS_IPS:
|
||||||
task = _SETUP_STATUS_INFLIGHT.get(client_ip)
|
by_time = sorted(_SETUP_STATUS_COOLDOWN.items(), key=lambda kv: kv[1])
|
||||||
if task is None:
|
for k, _ in by_time[: len(by_time) // 2]:
|
||||||
# Evict stale entries when dict grows too large to bound memory usage.
|
del _SETUP_STATUS_COOLDOWN[k]
|
||||||
if len(_SETUP_STATUS_CACHE) >= _MAX_TRACKED_SETUP_STATUS_IPS:
|
_SETUP_STATUS_COOLDOWN[client_ip] = now
|
||||||
cutoff = now - _SETUP_STATUS_CACHE_TTL_SECONDS
|
admin_count = await get_local_provider().count_admin_users()
|
||||||
stale = [k for k, (t, _) in _SETUP_STATUS_CACHE.items() if t < cutoff]
|
return {"needs_setup": admin_count == 0}
|
||||||
for k in stale:
|
|
||||||
del _SETUP_STATUS_CACHE[k]
|
|
||||||
if len(_SETUP_STATUS_CACHE) >= _MAX_TRACKED_SETUP_STATUS_IPS:
|
|
||||||
by_time = sorted(_SETUP_STATUS_CACHE.items(), key=lambda entry: entry[1][0])
|
|
||||||
for k, _ in by_time[: len(by_time) // 2]:
|
|
||||||
del _SETUP_STATUS_CACHE[k]
|
|
||||||
|
|
||||||
async def _compute_setup_status() -> dict:
|
|
||||||
admin_count = await get_local_provider().count_admin_users()
|
|
||||||
return {"needs_setup": admin_count == 0}
|
|
||||||
|
|
||||||
task = asyncio.create_task(_compute_setup_status())
|
|
||||||
_SETUP_STATUS_INFLIGHT[client_ip] = task
|
|
||||||
|
|
||||||
try:
|
|
||||||
result = await task
|
|
||||||
finally:
|
|
||||||
async with _SETUP_STATUS_INFLIGHT_GUARD:
|
|
||||||
if _SETUP_STATUS_INFLIGHT.get(client_ip) is task:
|
|
||||||
del _SETUP_STATUS_INFLIGHT[client_ip]
|
|
||||||
|
|
||||||
# Cache only the stable "initialized" result to avoid stale setup redirects.
|
|
||||||
if result["needs_setup"] is False:
|
|
||||||
_SETUP_STATUS_CACHE[client_ip] = (time.time(), result)
|
|
||||||
else:
|
|
||||||
_SETUP_STATUS_CACHE.pop(client_ip, None)
|
|
||||||
return result
|
|
||||||
|
|
||||||
|
|
||||||
class InitializeAdminRequest(BaseModel):
|
class InitializeAdminRequest(BaseModel):
|
||||||
|
|||||||
@@ -63,99 +63,6 @@ class McpConfigUpdateRequest(BaseModel):
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
_MASKED_VALUE = "***"
|
|
||||||
|
|
||||||
|
|
||||||
def _mask_server_config(server: McpServerConfigResponse) -> McpServerConfigResponse:
|
|
||||||
"""Return a copy of server config with sensitive fields masked.
|
|
||||||
|
|
||||||
Masks env values, header values, and removes OAuth secrets so they
|
|
||||||
are not exposed through the GET API endpoint.
|
|
||||||
"""
|
|
||||||
masked_env = {k: _MASKED_VALUE for k in server.env}
|
|
||||||
masked_headers = {k: _MASKED_VALUE for k in server.headers}
|
|
||||||
masked_oauth = None
|
|
||||||
if server.oauth is not None:
|
|
||||||
masked_oauth = server.oauth.model_copy(
|
|
||||||
update={
|
|
||||||
"client_secret": None,
|
|
||||||
"refresh_token": None,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
return server.model_copy(
|
|
||||||
update={
|
|
||||||
"env": masked_env,
|
|
||||||
"headers": masked_headers,
|
|
||||||
"oauth": masked_oauth,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def _merge_preserving_secrets(
|
|
||||||
incoming: McpServerConfigResponse,
|
|
||||||
existing: McpServerConfigResponse,
|
|
||||||
) -> McpServerConfigResponse:
|
|
||||||
"""Merge incoming config with existing, preserving secrets masked by GET.
|
|
||||||
|
|
||||||
When the frontend toggles ``enabled`` it round-trips the full config:
|
|
||||||
GET (masked) → modify enabled → PUT (masked values sent back).
|
|
||||||
This function ensures masked values (``***``) are replaced with the
|
|
||||||
real secrets from the current on-disk config.
|
|
||||||
|
|
||||||
``***`` is only accepted for keys that already exist in *existing*.
|
|
||||||
New keys must provide a real value.
|
|
||||||
|
|
||||||
For OAuth secrets, ``None`` means "preserve the existing stored value"
|
|
||||||
so masked GET responses can be safely round-tripped. To explicitly clear
|
|
||||||
a stored secret, clients may send an empty string, which is converted
|
|
||||||
to ``None`` before persisting.
|
|
||||||
"""
|
|
||||||
merged_env = {}
|
|
||||||
for k, v in incoming.env.items():
|
|
||||||
if v == _MASKED_VALUE:
|
|
||||||
if k in existing.env:
|
|
||||||
merged_env[k] = existing.env[k]
|
|
||||||
else:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=400,
|
|
||||||
detail=f"Cannot set env key '{k}' to masked value '***'; provide a real value.",
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
merged_env[k] = v
|
|
||||||
|
|
||||||
merged_headers = {}
|
|
||||||
for k, v in incoming.headers.items():
|
|
||||||
if v == _MASKED_VALUE:
|
|
||||||
if k in existing.headers:
|
|
||||||
merged_headers[k] = existing.headers[k]
|
|
||||||
else:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=400,
|
|
||||||
detail=f"Cannot set header '{k}' to masked value '***'; provide a real value.",
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
merged_headers[k] = v
|
|
||||||
|
|
||||||
merged_oauth = incoming.oauth
|
|
||||||
if incoming.oauth is not None and existing.oauth is not None:
|
|
||||||
# None = preserve (masked round-trip), "" = explicitly clear, else = new value
|
|
||||||
merged_client_secret = existing.oauth.client_secret if incoming.oauth.client_secret is None else (None if incoming.oauth.client_secret == "" else incoming.oauth.client_secret)
|
|
||||||
merged_refresh_token = existing.oauth.refresh_token if incoming.oauth.refresh_token is None else (None if incoming.oauth.refresh_token == "" else incoming.oauth.refresh_token)
|
|
||||||
merged_oauth = incoming.oauth.model_copy(
|
|
||||||
update={
|
|
||||||
"client_secret": merged_client_secret,
|
|
||||||
"refresh_token": merged_refresh_token,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
return incoming.model_copy(
|
|
||||||
update={
|
|
||||||
"env": merged_env,
|
|
||||||
"headers": merged_headers,
|
|
||||||
"oauth": merged_oauth,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
@router.get(
|
@router.get(
|
||||||
"/mcp/config",
|
"/mcp/config",
|
||||||
response_model=McpConfigResponse,
|
response_model=McpConfigResponse,
|
||||||
@@ -176,7 +83,7 @@ async def get_mcp_configuration() -> McpConfigResponse:
|
|||||||
"enabled": true,
|
"enabled": true,
|
||||||
"command": "npx",
|
"command": "npx",
|
||||||
"args": ["-y", "@modelcontextprotocol/server-github"],
|
"args": ["-y", "@modelcontextprotocol/server-github"],
|
||||||
"env": {"GITHUB_TOKEN": "***"},
|
"env": {"GITHUB_TOKEN": "ghp_xxx"},
|
||||||
"description": "GitHub MCP server for repository operations"
|
"description": "GitHub MCP server for repository operations"
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -185,8 +92,7 @@ async def get_mcp_configuration() -> McpConfigResponse:
|
|||||||
"""
|
"""
|
||||||
config = get_extensions_config()
|
config = get_extensions_config()
|
||||||
|
|
||||||
servers = {name: _mask_server_config(McpServerConfigResponse(**server.model_dump())) for name, server in config.mcp_servers.items()}
|
return McpConfigResponse(mcp_servers={name: McpServerConfigResponse(**server.model_dump()) for name, server in config.mcp_servers.items()})
|
||||||
return McpConfigResponse(mcp_servers=servers)
|
|
||||||
|
|
||||||
|
|
||||||
@router.put(
|
@router.put(
|
||||||
@@ -236,39 +142,14 @@ async def update_mcp_configuration(request: McpConfigUpdateRequest) -> McpConfig
|
|||||||
config_path = Path.cwd().parent / "extensions_config.json"
|
config_path = Path.cwd().parent / "extensions_config.json"
|
||||||
logger.info(f"No existing extensions config found. Creating new config at: {config_path}")
|
logger.info(f"No existing extensions config found. Creating new config at: {config_path}")
|
||||||
|
|
||||||
# Load current config to preserve skills
|
# Load current config to preserve skills configuration
|
||||||
current_config = get_extensions_config()
|
current_config = get_extensions_config()
|
||||||
|
|
||||||
# Load raw (un-resolved) JSON from disk to use as the merge source.
|
# Convert request to dict format for JSON serialization
|
||||||
# This preserves $VAR placeholders in env values and top-level keys
|
config_data = {
|
||||||
# like mcpInterceptors that would otherwise be lost.
|
"mcpServers": {name: server.model_dump() for name, server in request.mcp_servers.items()},
|
||||||
raw_servers: dict[str, dict] = {}
|
"skills": {name: {"enabled": skill.enabled} for name, skill in current_config.skills.items()},
|
||||||
raw_other_keys: dict = {}
|
}
|
||||||
if config_path is not None and config_path.exists():
|
|
||||||
with open(config_path, encoding="utf-8") as f:
|
|
||||||
raw_data = json.load(f)
|
|
||||||
raw_servers = raw_data.get("mcpServers", {})
|
|
||||||
# Preserve any top-level keys beyond mcpServers/skills
|
|
||||||
for key, value in raw_data.items():
|
|
||||||
if key not in ("mcpServers", "skills"):
|
|
||||||
raw_other_keys[key] = value
|
|
||||||
|
|
||||||
# Merge incoming server configs with raw on-disk secrets
|
|
||||||
merged_servers: dict[str, McpServerConfigResponse] = {}
|
|
||||||
for name, incoming in request.mcp_servers.items():
|
|
||||||
raw_server = raw_servers.get(name)
|
|
||||||
if raw_server is not None:
|
|
||||||
merged_servers[name] = _merge_preserving_secrets(
|
|
||||||
incoming,
|
|
||||||
McpServerConfigResponse(**raw_server),
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
merged_servers[name] = incoming
|
|
||||||
|
|
||||||
# Build config data preserving all top-level keys from the original file
|
|
||||||
config_data = dict(raw_other_keys)
|
|
||||||
config_data["mcpServers"] = {name: server.model_dump() for name, server in merged_servers.items()}
|
|
||||||
config_data["skills"] = {name: {"enabled": skill.enabled} for name, skill in current_config.skills.items()}
|
|
||||||
|
|
||||||
# Write the configuration to file
|
# Write the configuration to file
|
||||||
with open(config_path, "w", encoding="utf-8") as f:
|
with open(config_path, "w", encoding="utf-8") as f:
|
||||||
@@ -276,12 +157,12 @@ async def update_mcp_configuration(request: McpConfigUpdateRequest) -> McpConfig
|
|||||||
|
|
||||||
logger.info(f"MCP configuration updated and saved to: {config_path}")
|
logger.info(f"MCP configuration updated and saved to: {config_path}")
|
||||||
|
|
||||||
# Reload the Gateway configuration and update the global cache. The
|
# NOTE: No need to reload/reset cache here - LangGraph Server (separate process)
|
||||||
# agent runtime lives in Gateway, so this keeps API reads and tool
|
# will detect config file changes via mtime and reinitialize MCP tools automatically
|
||||||
# execution aligned after extensions_config.json changes.
|
|
||||||
|
# Reload the configuration and update the global cache
|
||||||
reloaded_config = reload_extensions_config()
|
reloaded_config = reload_extensions_config()
|
||||||
servers = {name: _mask_server_config(McpServerConfigResponse(**server.model_dump())) for name, server in reloaded_config.mcp_servers.items()}
|
return McpConfigResponse(mcp_servers={name: McpServerConfigResponse(**server.model_dump()) for name, server in reloaded_config.mcp_servers.items()})
|
||||||
return McpConfigResponse(mcp_servers=servers)
|
|
||||||
|
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.error(f"Failed to update MCP configuration: {e}", exc_info=True)
|
logger.error(f"Failed to update MCP configuration: {e}", exc_info=True)
|
||||||
|
|||||||
@@ -7,6 +7,7 @@ is reused so that conversation history is preserved across calls.
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
import logging
|
import logging
|
||||||
import uuid
|
import uuid
|
||||||
|
|
||||||
@@ -16,7 +17,7 @@ from fastapi.responses import StreamingResponse
|
|||||||
from app.gateway.authz import require_permission
|
from app.gateway.authz import require_permission
|
||||||
from app.gateway.deps import get_checkpointer, get_feedback_repo, get_run_event_store, get_run_manager, get_run_store, get_stream_bridge
|
from app.gateway.deps import get_checkpointer, get_feedback_repo, get_run_event_store, get_run_manager, get_run_store, get_stream_bridge
|
||||||
from app.gateway.routers.thread_runs import RunCreateRequest
|
from app.gateway.routers.thread_runs import RunCreateRequest
|
||||||
from app.gateway.services import sse_consumer, start_run, wait_for_run_completion
|
from app.gateway.services import sse_consumer, start_run
|
||||||
from deerflow.runtime import serialize_channel_values
|
from deerflow.runtime import serialize_channel_values
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
@@ -65,25 +66,24 @@ async def stateless_wait(body: RunCreateRequest, request: Request) -> dict:
|
|||||||
Otherwise a new temporary thread is created.
|
Otherwise a new temporary thread is created.
|
||||||
"""
|
"""
|
||||||
thread_id = _resolve_thread_id(body)
|
thread_id = _resolve_thread_id(body)
|
||||||
bridge = get_stream_bridge(request)
|
|
||||||
run_mgr = get_run_manager(request)
|
|
||||||
record = await start_run(body, thread_id, request)
|
record = await start_run(body, thread_id, request)
|
||||||
|
|
||||||
completed = True
|
|
||||||
if record.task is not None:
|
if record.task is not None:
|
||||||
completed = await wait_for_run_completion(bridge, record, request, run_mgr)
|
|
||||||
|
|
||||||
if completed:
|
|
||||||
checkpointer = get_checkpointer(request)
|
|
||||||
config = {"configurable": {"thread_id": thread_id}}
|
|
||||||
try:
|
try:
|
||||||
checkpoint_tuple = await checkpointer.aget_tuple(config)
|
await record.task
|
||||||
if checkpoint_tuple is not None:
|
except asyncio.CancelledError:
|
||||||
checkpoint = getattr(checkpoint_tuple, "checkpoint", {}) or {}
|
pass
|
||||||
channel_values = checkpoint.get("channel_values", {})
|
|
||||||
return serialize_channel_values(channel_values)
|
checkpointer = get_checkpointer(request)
|
||||||
except Exception:
|
config = {"configurable": {"thread_id": thread_id}}
|
||||||
logger.exception("Failed to fetch final state for run %s", record.run_id)
|
try:
|
||||||
|
checkpoint_tuple = await checkpointer.aget_tuple(config)
|
||||||
|
if checkpoint_tuple is not None:
|
||||||
|
checkpoint = getattr(checkpoint_tuple, "checkpoint", {}) or {}
|
||||||
|
channel_values = checkpoint.get("channel_values", {})
|
||||||
|
return serialize_channel_values(channel_values)
|
||||||
|
except Exception:
|
||||||
|
logger.exception("Failed to fetch final state for run %s", record.run_id)
|
||||||
|
|
||||||
return {"status": record.status.value, "error": record.error}
|
return {"status": record.status.value, "error": record.error}
|
||||||
|
|
||||||
|
|||||||
@@ -21,8 +21,8 @@ from pydantic import BaseModel, Field
|
|||||||
|
|
||||||
from app.gateway.authz import require_permission
|
from app.gateway.authz import require_permission
|
||||||
from app.gateway.deps import get_checkpointer, get_current_user, get_feedback_repo, get_run_event_store, get_run_manager, get_run_store, get_stream_bridge
|
from app.gateway.deps import get_checkpointer, get_current_user, get_feedback_repo, get_run_event_store, get_run_manager, get_run_store, get_stream_bridge
|
||||||
from app.gateway.services import sse_consumer, start_run, wait_for_run_completion
|
from app.gateway.services import sse_consumer, start_run
|
||||||
from deerflow.runtime import RunRecord, RunStatus, serialize_channel_values
|
from deerflow.runtime import RunRecord, serialize_channel_values
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
router = APIRouter(prefix="/api/threads", tags=["runs"])
|
router = APIRouter(prefix="/api/threads", tags=["runs"])
|
||||||
@@ -66,14 +66,6 @@ class RunResponse(BaseModel):
|
|||||||
multitask_strategy: str = "reject"
|
multitask_strategy: str = "reject"
|
||||||
created_at: str = ""
|
created_at: str = ""
|
||||||
updated_at: str = ""
|
updated_at: str = ""
|
||||||
total_input_tokens: int = 0
|
|
||||||
total_output_tokens: int = 0
|
|
||||||
total_tokens: int = 0
|
|
||||||
llm_call_count: int = 0
|
|
||||||
lead_agent_tokens: int = 0
|
|
||||||
subagent_tokens: int = 0
|
|
||||||
middleware_tokens: int = 0
|
|
||||||
message_count: int = 0
|
|
||||||
|
|
||||||
|
|
||||||
class ThreadTokenUsageModelBreakdown(BaseModel):
|
class ThreadTokenUsageModelBreakdown(BaseModel):
|
||||||
@@ -102,12 +94,6 @@ class ThreadTokenUsageResponse(BaseModel):
|
|||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
def _cancel_conflict_detail(run_id: str, record: RunRecord) -> str:
|
|
||||||
if record.status in (RunStatus.pending, RunStatus.running):
|
|
||||||
return f"Run {run_id} is not active on this worker and cannot be cancelled"
|
|
||||||
return f"Run {run_id} is not cancellable (status: {record.status.value})"
|
|
||||||
|
|
||||||
|
|
||||||
def _record_to_response(record: RunRecord) -> RunResponse:
|
def _record_to_response(record: RunRecord) -> RunResponse:
|
||||||
return RunResponse(
|
return RunResponse(
|
||||||
run_id=record.run_id,
|
run_id=record.run_id,
|
||||||
@@ -119,14 +105,6 @@ def _record_to_response(record: RunRecord) -> RunResponse:
|
|||||||
multitask_strategy=record.multitask_strategy,
|
multitask_strategy=record.multitask_strategy,
|
||||||
created_at=record.created_at,
|
created_at=record.created_at,
|
||||||
updated_at=record.updated_at,
|
updated_at=record.updated_at,
|
||||||
total_input_tokens=record.total_input_tokens,
|
|
||||||
total_output_tokens=record.total_output_tokens,
|
|
||||||
total_tokens=record.total_tokens,
|
|
||||||
llm_call_count=record.llm_call_count,
|
|
||||||
lead_agent_tokens=record.lead_agent_tokens,
|
|
||||||
subagent_tokens=record.subagent_tokens,
|
|
||||||
middleware_tokens=record.middleware_tokens,
|
|
||||||
message_count=record.message_count,
|
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@@ -175,25 +153,24 @@ async def stream_run(thread_id: str, body: RunCreateRequest, request: Request) -
|
|||||||
@require_permission("runs", "create", owner_check=True, require_existing=True)
|
@require_permission("runs", "create", owner_check=True, require_existing=True)
|
||||||
async def wait_run(thread_id: str, body: RunCreateRequest, request: Request) -> dict:
|
async def wait_run(thread_id: str, body: RunCreateRequest, request: Request) -> dict:
|
||||||
"""Create a run and block until it completes, returning the final state."""
|
"""Create a run and block until it completes, returning the final state."""
|
||||||
bridge = get_stream_bridge(request)
|
|
||||||
run_mgr = get_run_manager(request)
|
|
||||||
record = await start_run(body, thread_id, request)
|
record = await start_run(body, thread_id, request)
|
||||||
|
|
||||||
completed = True
|
|
||||||
if record.task is not None:
|
if record.task is not None:
|
||||||
completed = await wait_for_run_completion(bridge, record, request, run_mgr)
|
|
||||||
|
|
||||||
if completed:
|
|
||||||
checkpointer = get_checkpointer(request)
|
|
||||||
config = {"configurable": {"thread_id": thread_id}}
|
|
||||||
try:
|
try:
|
||||||
checkpoint_tuple = await checkpointer.aget_tuple(config)
|
await record.task
|
||||||
if checkpoint_tuple is not None:
|
except asyncio.CancelledError:
|
||||||
checkpoint = getattr(checkpoint_tuple, "checkpoint", {}) or {}
|
pass
|
||||||
channel_values = checkpoint.get("channel_values", {})
|
|
||||||
return serialize_channel_values(channel_values)
|
checkpointer = get_checkpointer(request)
|
||||||
except Exception:
|
config = {"configurable": {"thread_id": thread_id}}
|
||||||
logger.exception("Failed to fetch final state for run %s", record.run_id)
|
try:
|
||||||
|
checkpoint_tuple = await checkpointer.aget_tuple(config)
|
||||||
|
if checkpoint_tuple is not None:
|
||||||
|
checkpoint = getattr(checkpoint_tuple, "checkpoint", {}) or {}
|
||||||
|
channel_values = checkpoint.get("channel_values", {})
|
||||||
|
return serialize_channel_values(channel_values)
|
||||||
|
except Exception:
|
||||||
|
logger.exception("Failed to fetch final state for run %s", record.run_id)
|
||||||
|
|
||||||
return {"status": record.status.value, "error": record.error}
|
return {"status": record.status.value, "error": record.error}
|
||||||
|
|
||||||
@@ -203,8 +180,7 @@ async def wait_run(thread_id: str, body: RunCreateRequest, request: Request) ->
|
|||||||
async def list_runs(thread_id: str, request: Request) -> list[RunResponse]:
|
async def list_runs(thread_id: str, request: Request) -> list[RunResponse]:
|
||||||
"""List all runs for a thread."""
|
"""List all runs for a thread."""
|
||||||
run_mgr = get_run_manager(request)
|
run_mgr = get_run_manager(request)
|
||||||
user_id = await get_current_user(request)
|
records = await run_mgr.list_by_thread(thread_id)
|
||||||
records = await run_mgr.list_by_thread(thread_id, user_id=user_id)
|
|
||||||
return [_record_to_response(r) for r in records]
|
return [_record_to_response(r) for r in records]
|
||||||
|
|
||||||
|
|
||||||
@@ -213,8 +189,7 @@ async def list_runs(thread_id: str, request: Request) -> list[RunResponse]:
|
|||||||
async def get_run(thread_id: str, run_id: str, request: Request) -> RunResponse:
|
async def get_run(thread_id: str, run_id: str, request: Request) -> RunResponse:
|
||||||
"""Get details of a specific run."""
|
"""Get details of a specific run."""
|
||||||
run_mgr = get_run_manager(request)
|
run_mgr = get_run_manager(request)
|
||||||
user_id = await get_current_user(request)
|
record = run_mgr.get(run_id)
|
||||||
record = await run_mgr.get(run_id, user_id=user_id)
|
|
||||||
if record is None or record.thread_id != thread_id:
|
if record is None or record.thread_id != thread_id:
|
||||||
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
||||||
return _record_to_response(record)
|
return _record_to_response(record)
|
||||||
@@ -237,13 +212,16 @@ async def cancel_run(
|
|||||||
- wait=false: Return immediately with 202
|
- wait=false: Return immediately with 202
|
||||||
"""
|
"""
|
||||||
run_mgr = get_run_manager(request)
|
run_mgr = get_run_manager(request)
|
||||||
record = await run_mgr.get(run_id)
|
record = run_mgr.get(run_id)
|
||||||
if record is None or record.thread_id != thread_id:
|
if record is None or record.thread_id != thread_id:
|
||||||
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
||||||
|
|
||||||
cancelled = await run_mgr.cancel(run_id, action=action)
|
cancelled = await run_mgr.cancel(run_id, action=action)
|
||||||
if not cancelled:
|
if not cancelled:
|
||||||
raise HTTPException(status_code=409, detail=_cancel_conflict_detail(run_id, record))
|
raise HTTPException(
|
||||||
|
status_code=409,
|
||||||
|
detail=f"Run {run_id} is not cancellable (status: {record.status.value})",
|
||||||
|
)
|
||||||
|
|
||||||
if wait and record.task is not None:
|
if wait and record.task is not None:
|
||||||
try:
|
try:
|
||||||
@@ -259,14 +237,12 @@ async def cancel_run(
|
|||||||
@require_permission("runs", "read", owner_check=True)
|
@require_permission("runs", "read", owner_check=True)
|
||||||
async def join_run(thread_id: str, run_id: str, request: Request) -> StreamingResponse:
|
async def join_run(thread_id: str, run_id: str, request: Request) -> StreamingResponse:
|
||||||
"""Join an existing run's SSE stream."""
|
"""Join an existing run's SSE stream."""
|
||||||
|
bridge = get_stream_bridge(request)
|
||||||
run_mgr = get_run_manager(request)
|
run_mgr = get_run_manager(request)
|
||||||
record = await run_mgr.get(run_id)
|
record = run_mgr.get(run_id)
|
||||||
if record is None or record.thread_id != thread_id:
|
if record is None or record.thread_id != thread_id:
|
||||||
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
||||||
if record.store_only:
|
|
||||||
raise HTTPException(status_code=409, detail=f"Run {run_id} is not active on this worker and cannot be streamed")
|
|
||||||
|
|
||||||
bridge = get_stream_bridge(request)
|
|
||||||
return StreamingResponse(
|
return StreamingResponse(
|
||||||
sse_consumer(bridge, record, request, run_mgr),
|
sse_consumer(bridge, record, request, run_mgr),
|
||||||
media_type="text/event-stream",
|
media_type="text/event-stream",
|
||||||
@@ -278,12 +254,7 @@ async def join_run(thread_id: str, run_id: str, request: Request) -> StreamingRe
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
# Register GET and POST as separate routes so each method gets a unique OpenAPI
|
@router.api_route("/{thread_id}/runs/{run_id}/stream", methods=["GET", "POST"], response_model=None)
|
||||||
# operationId. ``api_route(methods=["GET", "POST"])`` shares one route registration
|
|
||||||
# across both methods, which makes FastAPI emit the same ``operationId`` twice and
|
|
||||||
# warn about a duplicate operation id during OpenAPI generation.
|
|
||||||
@router.get("/{thread_id}/runs/{run_id}/stream", response_model=None)
|
|
||||||
@router.post("/{thread_id}/runs/{run_id}/stream", response_model=None)
|
|
||||||
@require_permission("runs", "read", owner_check=True)
|
@require_permission("runs", "read", owner_check=True)
|
||||||
async def stream_existing_run(
|
async def stream_existing_run(
|
||||||
thread_id: str,
|
thread_id: str,
|
||||||
@@ -300,18 +271,14 @@ async def stream_existing_run(
|
|||||||
remaining buffered events so the client observes a clean shutdown.
|
remaining buffered events so the client observes a clean shutdown.
|
||||||
"""
|
"""
|
||||||
run_mgr = get_run_manager(request)
|
run_mgr = get_run_manager(request)
|
||||||
record = await run_mgr.get(run_id)
|
record = run_mgr.get(run_id)
|
||||||
if record is None or record.thread_id != thread_id:
|
if record is None or record.thread_id != thread_id:
|
||||||
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
|
||||||
if record.store_only and action is None:
|
|
||||||
raise HTTPException(status_code=409, detail=f"Run {run_id} is not active on this worker and cannot be streamed")
|
|
||||||
|
|
||||||
# Cancel if an action was requested (stop-button / interrupt flow)
|
# Cancel if an action was requested (stop-button / interrupt flow)
|
||||||
if action is not None:
|
if action is not None:
|
||||||
cancelled = await run_mgr.cancel(run_id, action=action)
|
cancelled = await run_mgr.cancel(run_id, action=action)
|
||||||
if not cancelled:
|
if cancelled and wait and record.task is not None:
|
||||||
raise HTTPException(status_code=409, detail=_cancel_conflict_detail(run_id, record))
|
|
||||||
if wait and record.task is not None:
|
|
||||||
try:
|
try:
|
||||||
await record.task
|
await record.task
|
||||||
except (asyncio.CancelledError, Exception):
|
except (asyncio.CancelledError, Exception):
|
||||||
@@ -424,15 +391,8 @@ async def list_run_events(
|
|||||||
|
|
||||||
@router.get("/{thread_id}/token-usage", response_model=ThreadTokenUsageResponse)
|
@router.get("/{thread_id}/token-usage", response_model=ThreadTokenUsageResponse)
|
||||||
@require_permission("threads", "read", owner_check=True)
|
@require_permission("threads", "read", owner_check=True)
|
||||||
async def thread_token_usage(
|
async def thread_token_usage(thread_id: str, request: Request) -> ThreadTokenUsageResponse:
|
||||||
thread_id: str,
|
|
||||||
request: Request,
|
|
||||||
include_active: bool = Query(default=False, description="Include running run progress snapshots"),
|
|
||||||
) -> ThreadTokenUsageResponse:
|
|
||||||
"""Thread-level token usage aggregation."""
|
"""Thread-level token usage aggregation."""
|
||||||
run_store = get_run_store(request)
|
run_store = get_run_store(request)
|
||||||
if include_active:
|
agg = await run_store.aggregate_tokens_by_thread(thread_id)
|
||||||
agg = await run_store.aggregate_tokens_by_thread(thread_id, include_active=True)
|
|
||||||
else:
|
|
||||||
agg = await run_store.aggregate_tokens_by_thread(thread_id)
|
|
||||||
return ThreadTokenUsageResponse(thread_id=thread_id, **agg)
|
return ThreadTokenUsageResponse(thread_id=thread_id, **agg)
|
||||||
|
|||||||
@@ -69,30 +69,11 @@ def _make_file_sandbox_writable(file_path: os.PathLike[str] | str) -> None:
|
|||||||
logger.warning("Skipping sandbox chmod for symlinked upload path: %s", file_path)
|
logger.warning("Skipping sandbox chmod for symlinked upload path: %s", file_path)
|
||||||
return
|
return
|
||||||
|
|
||||||
writable_mode = stat.S_IMODE(file_stat.st_mode) | stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH | stat.S_IRGRP | stat.S_IROTH
|
writable_mode = stat.S_IMODE(file_stat.st_mode) | stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
|
||||||
chmod_kwargs = {"follow_symlinks": False} if os.chmod in os.supports_follow_symlinks else {}
|
chmod_kwargs = {"follow_symlinks": False} if os.chmod in os.supports_follow_symlinks else {}
|
||||||
os.chmod(file_path, writable_mode, **chmod_kwargs)
|
os.chmod(file_path, writable_mode, **chmod_kwargs)
|
||||||
|
|
||||||
|
|
||||||
def _make_file_sandbox_readable(file_path: os.PathLike[str] | str) -> None:
|
|
||||||
"""Ensure uploaded files are readable by the sandbox process.
|
|
||||||
|
|
||||||
For Docker sandboxes (AIO), the gateway writes files as root with 0o600
|
|
||||||
permissions, then bind-mounts the host directory into the container. The
|
|
||||||
sandbox process inside the container runs as a non-root user and cannot
|
|
||||||
read those files without group/other read bits. This function adds
|
|
||||||
``S_IRGRP | S_IROTH`` so the sandbox can read the uploaded content.
|
|
||||||
"""
|
|
||||||
file_stat = os.lstat(file_path)
|
|
||||||
if stat.S_ISLNK(file_stat.st_mode):
|
|
||||||
logger.warning("Skipping sandbox chmod for symlinked upload path: %s", file_path)
|
|
||||||
return
|
|
||||||
|
|
||||||
readable_mode = stat.S_IMODE(file_stat.st_mode) | stat.S_IRGRP | stat.S_IROTH
|
|
||||||
chmod_kwargs = {"follow_symlinks": False} if os.chmod in os.supports_follow_symlinks else {}
|
|
||||||
os.chmod(file_path, readable_mode, **chmod_kwargs)
|
|
||||||
|
|
||||||
|
|
||||||
def _uses_thread_data_mounts(sandbox_provider: SandboxProvider) -> bool:
|
def _uses_thread_data_mounts(sandbox_provider: SandboxProvider) -> bool:
|
||||||
return bool(getattr(sandbox_provider, "uses_thread_data_mounts", False))
|
return bool(getattr(sandbox_provider, "uses_thread_data_mounts", False))
|
||||||
|
|
||||||
@@ -295,16 +276,6 @@ async def upload_files(
|
|||||||
_cleanup_uploaded_paths(written_paths)
|
_cleanup_uploaded_paths(written_paths)
|
||||||
raise HTTPException(status_code=500, detail=f"Failed to upload {file.filename}: {str(e)}")
|
raise HTTPException(status_code=500, detail=f"Failed to upload {file.filename}: {str(e)}")
|
||||||
|
|
||||||
# Uploaded files are created with 0o600 permissions (owner read/write only).
|
|
||||||
# In Docker sandbox deployments the gateway writes as root but the sandbox
|
|
||||||
# process runs as a non-root user (typically UID 1000). Without group/other
|
|
||||||
# read bits the sandbox cannot access the files — whether the uploads
|
|
||||||
# directory is bind-mounted into the container or synced via
|
|
||||||
# sandbox.update_file. Always add group/other read bits so every sandbox
|
|
||||||
# configuration can read the uploaded content.
|
|
||||||
for file_path in written_paths:
|
|
||||||
_make_file_sandbox_readable(file_path)
|
|
||||||
|
|
||||||
if sync_to_sandbox:
|
if sync_to_sandbox:
|
||||||
for file_path, virtual_path in sandbox_sync_targets:
|
for file_path, virtual_path in sandbox_sync_targets:
|
||||||
_make_file_sandbox_writable(file_path)
|
_make_file_sandbox_writable(file_path)
|
||||||
|
|||||||
@@ -15,8 +15,7 @@ from collections.abc import Mapping
|
|||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
from fastapi import HTTPException, Request
|
from fastapi import HTTPException, Request
|
||||||
from langchain_core.messages import BaseMessage
|
from langchain_core.messages import HumanMessage
|
||||||
from langchain_core.messages.utils import convert_to_messages
|
|
||||||
|
|
||||||
from app.gateway.deps import get_run_context, get_run_manager, get_stream_bridge
|
from app.gateway.deps import get_run_context, get_run_manager, get_stream_bridge
|
||||||
from app.gateway.utils import sanitize_log_param
|
from app.gateway.utils import sanitize_log_param
|
||||||
@@ -33,7 +32,6 @@ from deerflow.runtime import (
|
|||||||
UnsupportedStrategyError,
|
UnsupportedStrategyError,
|
||||||
run_agent,
|
run_agent,
|
||||||
)
|
)
|
||||||
from deerflow.runtime.runs.naming import resolve_root_run_name
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
@@ -77,35 +75,21 @@ def normalize_stream_modes(raw: list[str] | str | None) -> list[str]:
|
|||||||
|
|
||||||
|
|
||||||
def normalize_input(raw_input: dict[str, Any] | None) -> dict[str, Any]:
|
def normalize_input(raw_input: dict[str, Any] | None) -> dict[str, Any]:
|
||||||
"""Convert LangGraph Platform input format to LangChain state dict.
|
"""Convert LangGraph Platform input format to LangChain state dict."""
|
||||||
|
|
||||||
Delegates dict→message coercion to ``langchain_core.messages.utils.convert_to_messages``
|
|
||||||
so that ``additional_kwargs`` (e.g. uploaded-file metadata — gh #3132), ``id``,
|
|
||||||
``name``, and non-human roles (ai/system/tool) survive unchanged. An earlier
|
|
||||||
hand-rolled version only forwarded ``content`` and collapsed every role to
|
|
||||||
``HumanMessage``, which silently stripped frontend-supplied attachments.
|
|
||||||
|
|
||||||
Malformed message dicts (missing ``role``/``type``/``content``, unsupported
|
|
||||||
role, etc.) raise ``HTTPException(400)`` with the offending index, instead
|
|
||||||
of bubbling up as a 500. The gateway is a system boundary, so per-entry
|
|
||||||
validation errors are the right shape for clients to retry against.
|
|
||||||
"""
|
|
||||||
if raw_input is None:
|
if raw_input is None:
|
||||||
return {}
|
return {}
|
||||||
messages = raw_input.get("messages")
|
messages = raw_input.get("messages")
|
||||||
if messages and isinstance(messages, list):
|
if messages and isinstance(messages, list):
|
||||||
converted: list[Any] = []
|
converted = []
|
||||||
for index, msg in enumerate(messages):
|
for msg in messages:
|
||||||
if isinstance(msg, BaseMessage):
|
if isinstance(msg, dict):
|
||||||
converted.append(msg)
|
role = msg.get("role", msg.get("type", "user"))
|
||||||
elif isinstance(msg, dict):
|
content = msg.get("content", "")
|
||||||
try:
|
if role in ("user", "human"):
|
||||||
converted.extend(convert_to_messages([msg]))
|
converted.append(HumanMessage(content=content))
|
||||||
except (ValueError, TypeError, NotImplementedError) as exc:
|
else:
|
||||||
raise HTTPException(
|
# TODO: handle other message types (system, ai, tool)
|
||||||
status_code=400,
|
converted.append(HumanMessage(content=content))
|
||||||
detail=f"Invalid message at input.messages[{index}]: {exc}",
|
|
||||||
) from exc
|
|
||||||
else:
|
else:
|
||||||
converted.append(msg)
|
converted.append(msg)
|
||||||
return {**raw_input, "messages": converted}
|
return {**raw_input, "messages": converted}
|
||||||
@@ -251,7 +235,6 @@ def build_run_config(
|
|||||||
target = config.setdefault("configurable", {})
|
target = config.setdefault("configurable", {})
|
||||||
if target is not None and "agent_name" not in target:
|
if target is not None and "agent_name" not in target:
|
||||||
target["agent_name"] = normalized
|
target["agent_name"] = normalized
|
||||||
config.setdefault("run_name", resolve_root_run_name(config, normalized))
|
|
||||||
if metadata:
|
if metadata:
|
||||||
config.setdefault("metadata", {}).update(metadata)
|
config.setdefault("metadata", {}).update(metadata)
|
||||||
return config
|
return config
|
||||||
@@ -402,51 +385,3 @@ async def sse_consumer(
|
|||||||
if record.status in (RunStatus.pending, RunStatus.running):
|
if record.status in (RunStatus.pending, RunStatus.running):
|
||||||
if record.on_disconnect == DisconnectMode.cancel:
|
if record.on_disconnect == DisconnectMode.cancel:
|
||||||
await run_mgr.cancel(record.run_id)
|
await run_mgr.cancel(record.run_id)
|
||||||
|
|
||||||
|
|
||||||
async def wait_for_run_completion(
|
|
||||||
bridge: StreamBridge,
|
|
||||||
record: RunRecord,
|
|
||||||
request: Request,
|
|
||||||
run_mgr: RunManager,
|
|
||||||
) -> bool:
|
|
||||||
"""Block until the run publishes ``END_SENTINEL``, honouring on_disconnect.
|
|
||||||
|
|
||||||
The non-streaming ``/wait`` endpoints used to ``await record.task``
|
|
||||||
directly with no disconnect handling. When the client (or an
|
|
||||||
intermediate HTTP proxy) timed out during a long tool call such as
|
|
||||||
``pip install``, the handler would swallow ``CancelledError`` and
|
|
||||||
serialize whatever checkpoint happened to exist — masking a half-finished
|
|
||||||
run as a normal completion (issue #3265).
|
|
||||||
|
|
||||||
This helper consumes the same bridge that ``sse_consumer`` does so the
|
|
||||||
wait path shares its disconnect semantics: each wake-up polls
|
|
||||||
``request.is_disconnected()``; on a real disconnect it cancels the
|
|
||||||
background run when ``record.on_disconnect`` is ``cancel``. The bridge's
|
|
||||||
heartbeat sentinels guarantee at least one wake-up per
|
|
||||||
``heartbeat_interval`` even when the agent emits no events for a while.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
``True`` when ``END_SENTINEL`` was observed (run reached a terminal
|
|
||||||
state), ``False`` when the loop exited because the client
|
|
||||||
disconnected. Callers must skip checkpoint serialization on
|
|
||||||
``False`` so a partial checkpoint is not returned as a normal
|
|
||||||
response.
|
|
||||||
"""
|
|
||||||
completed = False
|
|
||||||
try:
|
|
||||||
async for entry in bridge.subscribe(record.run_id):
|
|
||||||
# END_SENTINEL means the run reached a terminal state; honour it
|
|
||||||
# even if the client just disconnected so the caller still serializes
|
|
||||||
# the real final checkpoint.
|
|
||||||
if entry is END_SENTINEL:
|
|
||||||
completed = True
|
|
||||||
return True
|
|
||||||
if await request.is_disconnected():
|
|
||||||
break
|
|
||||||
# Heartbeats and regular events: keep waiting for END_SENTINEL.
|
|
||||||
return completed
|
|
||||||
finally:
|
|
||||||
if not completed and record.status in (RunStatus.pending, RunStatus.running):
|
|
||||||
if record.on_disconnect == DisconnectMode.cancel:
|
|
||||||
await run_mgr.cancel(record.run_id)
|
|
||||||
|
|||||||
@@ -241,6 +241,13 @@ GET /api/mcp/config
|
|||||||
"GITHUB_TOKEN": "***"
|
"GITHUB_TOKEN": "***"
|
||||||
},
|
},
|
||||||
"description": "GitHub operations"
|
"description": "GitHub operations"
|
||||||
|
},
|
||||||
|
"filesystem": {
|
||||||
|
"enabled": false,
|
||||||
|
"type": "stdio",
|
||||||
|
"command": "npx",
|
||||||
|
"args": ["-y", "@modelcontextprotocol/server-filesystem"],
|
||||||
|
"description": "File system access"
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -4,12 +4,10 @@
|
|||||||
|
|
||||||
| 模式 | 启动命令 | Auth 层 | 端口 |
|
| 模式 | 启动命令 | Auth 层 | 端口 |
|
||||||
|------|---------|---------|------|
|
|------|---------|---------|------|
|
||||||
| 标准模式 | `make dev` | Gateway AuthMiddleware(全量) | 2026 (nginx) |
|
| 标准模式 | `make dev` | Gateway AuthMiddleware + LangGraph auth | 2026 (nginx) |
|
||||||
|
| Gateway 模式 | `make dev-pro` | Gateway AuthMiddleware(全量) | 2026 (nginx) |
|
||||||
| 直连 Gateway | `cd backend && make gateway` | Gateway AuthMiddleware | 8001 |
|
| 直连 Gateway | `cd backend && make gateway` | Gateway AuthMiddleware | 8001 |
|
||||||
| 直连 LangGraph 兼容性 | 手动运行 LangGraph 工具链时使用 | LangGraph auth | 2024 |
|
| 直连 LangGraph | `cd backend && make dev` | LangGraph auth | 2024 |
|
||||||
|
|
||||||
`make dev`、Docker dev 和生产部署默认都运行 Gateway embedded runtime。
|
|
||||||
`app.gateway.langgraph_auth` 仅用于保留的直连 LangGraph 工具链 / Studio 兼容性测试,不是标准服务启动路径。
|
|
||||||
|
|
||||||
每种模式下都需执行以下测试。
|
每种模式下都需执行以下测试。
|
||||||
|
|
||||||
@@ -23,8 +21,10 @@
|
|||||||
# 清除已有数据
|
# 清除已有数据
|
||||||
rm -f backend/.deer-flow/data/deerflow.db
|
rm -f backend/.deer-flow/data/deerflow.db
|
||||||
|
|
||||||
# 启动标准模式(Gateway embedded runtime)
|
# 选择模式启动
|
||||||
make dev
|
make dev # 标准模式
|
||||||
|
# 或
|
||||||
|
make dev-pro # Gateway 模式
|
||||||
```
|
```
|
||||||
|
|
||||||
**验证点:**
|
**验证点:**
|
||||||
@@ -57,7 +57,7 @@ make dev
|
|||||||
|
|
||||||
## 二、接口流程测试
|
## 二、接口流程测试
|
||||||
|
|
||||||
> 以下用 `BASE=http://localhost:2026` 为例。标准模式经 nginx 暴露此地址。
|
> 以下用 `BASE=http://localhost:2026` 为例。标准模式和 Gateway 模式都用此地址。
|
||||||
> 直连测试替换为对应端口。
|
> 直连测试替换为对应端口。
|
||||||
>
|
>
|
||||||
> **CSRF token 提取**:多处用到从 cookie jar 提取 CSRF token,统一使用:
|
> **CSRF token 提取**:多处用到从 cookie jar 提取 CSRF token,统一使用:
|
||||||
@@ -211,18 +211,20 @@ curl -s -X POST $BASE/api/threads/search \
|
|||||||
|
|
||||||
**预期:** 返回 0 或仅包含 user2 自己的 thread
|
**预期:** 返回 0 或仅包含 user2 自己的 thread
|
||||||
|
|
||||||
### 2.3 LangGraph-compatible Gateway 路由隔离
|
### 2.3 标准模式 LangGraph Server 隔离
|
||||||
|
|
||||||
#### TC-API-10: LangGraph-compatible 端点需要 cookie
|
> 仅在标准模式下测试。Gateway 模式不跑 LangGraph Server。
|
||||||
|
|
||||||
|
#### TC-API-10: LangGraph 端点需要 cookie
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 不带 cookie 访问 LangGraph-compatible 接口
|
# 不带 cookie 访问 LangGraph 接口
|
||||||
curl -s -w "%{http_code}" $BASE/api/langgraph/threads
|
curl -s -w "%{http_code}" $BASE/api/langgraph/threads
|
||||||
```
|
```
|
||||||
|
|
||||||
**预期:** 401
|
**预期:** 401
|
||||||
|
|
||||||
#### TC-API-11: LangGraph-compatible 路由带 cookie 可访问
|
#### TC-API-11: LangGraph 带 cookie 可访问
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
curl -s $BASE/api/langgraph/threads -b user1.txt | jq length
|
curl -s $BASE/api/langgraph/threads -b user1.txt | jq length
|
||||||
@@ -230,10 +232,10 @@ curl -s $BASE/api/langgraph/threads -b user1.txt | jq length
|
|||||||
|
|
||||||
**预期:** 200,返回 user1 的 thread 列表
|
**预期:** 200,返回 user1 的 thread 列表
|
||||||
|
|
||||||
#### TC-API-12: LangGraph-compatible 路由隔离 — 用户只看到自己的
|
#### TC-API-12: LangGraph 隔离 — 用户只看到自己的
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# user2 查 threads
|
# user2 查 LangGraph threads
|
||||||
curl -s $BASE/api/langgraph/threads -b user2.txt | jq length
|
curl -s $BASE/api/langgraph/threads -b user2.txt | jq length
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -1232,11 +1234,21 @@ P2=$(awk -F': ' '/^password:/ {print $2}' /tmp/deerflow-reset-p2.txt)
|
|||||||
## 七、模式差异测试
|
## 七、模式差异测试
|
||||||
|
|
||||||
> 以下用 `GW=http://localhost:8001` 表示直连 Gateway,`BASE=http://localhost:2026` 表示经 nginx。
|
> 以下用 `GW=http://localhost:8001` 表示直连 Gateway,`BASE=http://localhost:2026` 表示经 nginx。
|
||||||
> 标准启动命令:`make dev`(或 `./scripts/serve.sh --dev`)。
|
> Gateway 模式启动命令:`make dev-pro`(或 `./scripts/serve.sh --dev --gateway`)。
|
||||||
|
|
||||||
### 7.1 标准启动模式
|
### 7.1 标准模式独有
|
||||||
|
|
||||||
#### TC-MODE-01: Gateway AuthMiddleware 的 token_version 检查
|
> 启动命令:`make dev`(或 `./scripts/serve.sh --dev`)
|
||||||
|
|
||||||
|
#### TC-MODE-01: LangGraph Server 独立运行,需 cookie
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# 无 cookie 访问 LangGraph
|
||||||
|
curl -s -w "%{http_code}" -o /dev/null $BASE/api/langgraph/threads/search
|
||||||
|
# 预期: 403(LangGraph auth handler 拒绝)
|
||||||
|
```
|
||||||
|
|
||||||
|
#### TC-MODE-02: LangGraph auth 的 token_version 检查
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 登录拿 cookie
|
# 登录拿 cookie
|
||||||
@@ -1249,9 +1261,9 @@ curl -s -X POST $BASE/api/v1/auth/change-password \
|
|||||||
-b cookies.txt -H "Content-Type: application/json" -H "X-CSRF-Token: $CSRF" \
|
-b cookies.txt -H "Content-Type: application/json" -H "X-CSRF-Token: $CSRF" \
|
||||||
-d '{"current_password":"正确密码","new_password":"NewPass1!"}' -c new_cookies.txt
|
-d '{"current_password":"正确密码","new_password":"NewPass1!"}' -c new_cookies.txt
|
||||||
|
|
||||||
# 用旧 cookie 访问 LangGraph-compatible 路由
|
# 用旧 cookie 访问 LangGraph
|
||||||
curl -s -w "%{http_code}" $BASE/api/langgraph/threads/search -b cookies.txt
|
curl -s -w "%{http_code}" $BASE/api/langgraph/threads/search -b cookies.txt
|
||||||
# 预期: 401(token_version 不匹配)
|
# 预期: 403(token_version 不匹配)
|
||||||
|
|
||||||
# 用新 cookie 访问
|
# 用新 cookie 访问
|
||||||
CSRF2=$(grep csrf_token new_cookies.txt | awk '{print $NF}')
|
CSRF2=$(grep csrf_token new_cookies.txt | awk '{print $NF}')
|
||||||
@@ -1260,7 +1272,7 @@ curl -s -w "%{http_code}" -X POST $BASE/api/langgraph/threads/search \
|
|||||||
# 预期: 200
|
# 预期: 200
|
||||||
```
|
```
|
||||||
|
|
||||||
#### TC-MODE-02: Gateway owner filter 隔离
|
#### TC-MODE-03: LangGraph auth 的 owner filter 隔离
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# user1 创建 thread
|
# user1 创建 thread
|
||||||
@@ -1285,9 +1297,18 @@ print('OK: user2 sees', len(threads), 'threads, none belong to user1')
|
|||||||
"
|
"
|
||||||
```
|
```
|
||||||
|
|
||||||
#### TC-MODE-03: 所有请求经 AuthMiddleware
|
### 7.2 Gateway 模式独有
|
||||||
|
|
||||||
|
> 启动命令:`make dev-pro`(或 `./scripts/serve.sh --dev --gateway`)
|
||||||
|
> 无 LangGraph Server 进程,agent runtime 嵌入 Gateway。
|
||||||
|
|
||||||
|
#### TC-MODE-04: 所有请求经 AuthMiddleware
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
# 确认 LangGraph Server 未运行
|
||||||
|
curl -s -w "%{http_code}" -o /dev/null http://localhost:2024/ok
|
||||||
|
# 预期: 000(连接被拒)
|
||||||
|
|
||||||
# Gateway API 受保护
|
# Gateway API 受保护
|
||||||
curl -s -w "%{http_code}" -o /dev/null $BASE/api/models
|
curl -s -w "%{http_code}" -o /dev/null $BASE/api/models
|
||||||
# 预期: 401
|
# 预期: 401
|
||||||
@@ -1298,7 +1319,7 @@ curl -s -w "%{http_code}" -o /dev/null -X POST $BASE/api/langgraph/threads/searc
|
|||||||
# 预期: 401
|
# 预期: 401
|
||||||
```
|
```
|
||||||
|
|
||||||
#### TC-MODE-04: 标准模式下完整 auth 流程
|
#### TC-MODE-05: Gateway 模式下完整 auth 流程
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 登录
|
# 登录
|
||||||
@@ -1313,7 +1334,7 @@ curl -s -X POST $BASE/api/langgraph/threads \
|
|||||||
-d '{"metadata":{}}' | python3 -c "import sys,json; print(json.load(sys.stdin)['thread_id'])"
|
-d '{"metadata":{}}' | python3 -c "import sys,json; print(json.load(sys.stdin)['thread_id'])"
|
||||||
# 预期: 返回 thread_id
|
# 预期: 返回 thread_id
|
||||||
|
|
||||||
# CSRF 保护(CSRFMiddleware 覆盖所有 Gateway 路由)
|
# CSRF 保护(Gateway 模式下 CSRFMiddleware 直接覆盖所有路由)
|
||||||
curl -s -w "%{http_code}" -o /dev/null -X POST $BASE/api/langgraph/threads \
|
curl -s -w "%{http_code}" -o /dev/null -X POST $BASE/api/langgraph/threads \
|
||||||
-b cookies.txt -H "Content-Type: application/json" -d '{"metadata":{}}'
|
-b cookies.txt -H "Content-Type: application/json" -d '{"metadata":{}}'
|
||||||
# 预期: 403(CSRF token missing)
|
# 预期: 403(CSRF token missing)
|
||||||
@@ -1412,7 +1433,7 @@ done
|
|||||||
|
|
||||||
### 7.4 Docker 部署
|
### 7.4 Docker 部署
|
||||||
|
|
||||||
> 启动命令:`./scripts/deploy.sh`
|
> 启动命令:`./scripts/deploy.sh`(标准)或 `./scripts/deploy.sh --gateway`(Gateway 模式)
|
||||||
> Docker Compose 文件:`docker/docker-compose.yaml`
|
> Docker Compose 文件:`docker/docker-compose.yaml`
|
||||||
>
|
>
|
||||||
> 前置条件:
|
> 前置条件:
|
||||||
@@ -1521,16 +1542,16 @@ docker logs deer-flow-gateway 2>&1 | grep -iE "Password: .{15,}" && echo "FAIL:
|
|||||||
- 容器日志输出**路径**(不是密码本身),符合 CodeQL `py/clear-text-logging-sensitive-data` 规则
|
- 容器日志输出**路径**(不是密码本身),符合 CodeQL `py/clear-text-logging-sensitive-data` 规则
|
||||||
- `grep "Password:"` 在日志中**应当无匹配**(旧行为已废弃,simplify pass 移除了日志泄露路径)
|
- `grep "Password:"` 在日志中**应当无匹配**(旧行为已废弃,simplify pass 移除了日志泄露路径)
|
||||||
|
|
||||||
#### TC-DOCKER-06: Docker 部署
|
#### TC-DOCKER-06: Gateway 模式 Docker 部署
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 标准 Docker 模式:runtime 嵌入 gateway 容器
|
# Gateway 模式:无 langgraph 容器
|
||||||
./scripts/deploy.sh
|
./scripts/deploy.sh --gateway
|
||||||
sleep 15
|
sleep 15
|
||||||
|
|
||||||
# 确认 gateway 容器存在
|
# 确认 langgraph 容器不存在
|
||||||
docker ps --filter name=deer-flow-gateway --format '{{.Names}}'
|
docker ps --filter name=deer-flow-langgraph --format '{{.Names}}' | wc -l
|
||||||
# 预期: deer-flow-gateway
|
# 预期: 0
|
||||||
|
|
||||||
# auth 流程正常:未登录受保护接口返回 401
|
# auth 流程正常:未登录受保护接口返回 401
|
||||||
curl -s -w "%{http_code}" -o /dev/null $BASE/api/models
|
curl -s -w "%{http_code}" -o /dev/null $BASE/api/models
|
||||||
|
|||||||
@@ -99,7 +99,7 @@ rm -f backend/.deer-flow/data/deerflow.db
|
|||||||
| `.deer-flow/users/{user_id}/memory.json` | 用户级 memory |
|
| `.deer-flow/users/{user_id}/memory.json` | 用户级 memory |
|
||||||
| `.deer-flow/users/{user_id}/agents/{agent_name}/` | 用户自定义 agent 配置、SOUL 和 agent memory |
|
| `.deer-flow/users/{user_id}/agents/{agent_name}/` | 用户自定义 agent 配置、SOUL 和 agent memory |
|
||||||
| `.deer-flow/admin_initial_credentials.txt` | `reset_admin` 生成的新凭据文件(0600,读完应删除) |
|
| `.deer-flow/admin_initial_credentials.txt` | `reset_admin` 生成的新凭据文件(0600,读完应删除) |
|
||||||
| `.env` 中的 `AUTH_JWT_SECRET` | JWT 签名密钥(未设置时自动生成并持久化到 `.deer-flow/.jwt_secret`,重启后 session 保持) |
|
| `.env` 中的 `AUTH_JWT_SECRET` | JWT 签名密钥(未设置时自动生成临时密钥,重启后 session 失效) |
|
||||||
|
|
||||||
### 生产环境建议
|
### 生产环境建议
|
||||||
|
|
||||||
@@ -137,4 +137,4 @@ python -c "import secrets; print(secrets.token_urlsafe(32))"
|
|||||||
| 启动后没看到密码 | 当前实现不在启动日志输出密码 | 首次安装访问 `/setup`;忘记密码用 `reset_admin` |
|
| 启动后没看到密码 | 当前实现不在启动日志输出密码 | 首次安装访问 `/setup`;忘记密码用 `reset_admin` |
|
||||||
| `/login` 自动跳到 `/setup` | 系统还没有 admin | 在 `/setup` 创建第一个 admin |
|
| `/login` 自动跳到 `/setup` | 系统还没有 admin | 在 `/setup` 创建第一个 admin |
|
||||||
| 登录后 POST 返回 403 | CSRF token 缺失 | 确认前端已更新 |
|
| 登录后 POST 返回 403 | CSRF token 缺失 | 确认前端已更新 |
|
||||||
| 重启后需要重新登录 | `.jwt_secret` 文件被删除且 `.env` 未设置 `AUTH_JWT_SECRET` | 在 `.env` 中设置固定密钥 |
|
| 重启后需要重新登录 | `AUTH_JWT_SECRET` 未持久化 | 在 `.env` 中设置固定密钥 |
|
||||||
|
|||||||
@@ -1,154 +0,0 @@
|
|||||||
# Blocking IO detection usage and maintenance
|
|
||||||
|
|
||||||
This document describes how to use and maintain DeerFlow backend blocking-IO
|
|
||||||
detection for async event-loop safety.
|
|
||||||
|
|
||||||
The goal is narrow: find and prevent synchronous IO from blocking backend
|
|
||||||
async event-loop paths. Static and runtime detection are complementary, but
|
|
||||||
they have different jobs.
|
|
||||||
|
|
||||||
## Static detector
|
|
||||||
|
|
||||||
The static detector is the discovery tool. It scans backend source code and
|
|
||||||
reports candidate blocking-IO call sites that may need human review.
|
|
||||||
|
|
||||||
Run it from the repository root:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
make detect-blocking-io
|
|
||||||
```
|
|
||||||
|
|
||||||
Or from `backend/`:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
make detect-blocking-io
|
|
||||||
```
|
|
||||||
|
|
||||||
The report is written to:
|
|
||||||
|
|
||||||
```text
|
|
||||||
.deer-flow/blocking-io-findings.json
|
|
||||||
```
|
|
||||||
|
|
||||||
Use this output for review and triage. A static finding is a candidate, not
|
|
||||||
proof that production blocks the event loop at runtime. The current static
|
|
||||||
rules are intentionally broad; prefer triaging existing output before adding
|
|
||||||
new static rules.
|
|
||||||
|
|
||||||
Add a static rule only when review finds a recurring high-risk blocking
|
|
||||||
pattern that is invisible to the current detector.
|
|
||||||
|
|
||||||
## Runtime detector
|
|
||||||
|
|
||||||
The runtime detector is the CI regression guard. It uses Blockbuster to fail a
|
|
||||||
focused test when code under `app.*` or `deerflow.*` performs blocking IO on
|
|
||||||
the asyncio event-loop thread.
|
|
||||||
|
|
||||||
Run it from `backend/`:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
make test-blocking-io
|
|
||||||
```
|
|
||||||
|
|
||||||
The runtime gate starts from confirmed production bugs and protects those
|
|
||||||
paths from regressing. It does not prove that the entire backend is free of
|
|
||||||
blocking IO; it only covers the production paths exercised by
|
|
||||||
`backend/tests/blocking_io/`.
|
|
||||||
|
|
||||||
## Maintenance workflow
|
|
||||||
|
|
||||||
Use the static detector to find candidates, then use review to decide which
|
|
||||||
async production paths are worth protecting in CI.
|
|
||||||
|
|
||||||
The normal workflow is:
|
|
||||||
|
|
||||||
1. Run the static detector to find backend blocking-IO candidates.
|
|
||||||
2. Use human review to pick high-risk production async paths.
|
|
||||||
3. Add or update a focused runtime anchor in `backend/tests/blocking_io/`.
|
|
||||||
4. Let CI prevent that path from regressing.
|
|
||||||
|
|
||||||
Runtime detection has two maintenance paths.
|
|
||||||
|
|
||||||
### Add a runtime rule
|
|
||||||
|
|
||||||
Add a runtime rule when Blockbuster's default rules do not cover a generic
|
|
||||||
blocking primitive used by production code.
|
|
||||||
|
|
||||||
Rules belong in:
|
|
||||||
|
|
||||||
```text
|
|
||||||
backend/tests/support/detectors/blocking_io_runtime.py
|
|
||||||
```
|
|
||||||
|
|
||||||
Add them to `_PROJECT_BLOCKING_RULES`, not directly inside individual tests.
|
|
||||||
Keeping rules centralized makes it clear which extra primitives DeerFlow
|
|
||||||
expects Blockbuster to catch.
|
|
||||||
|
|
||||||
Example shape:
|
|
||||||
|
|
||||||
```python
|
|
||||||
import subprocess
|
|
||||||
|
|
||||||
from blockbuster import BlockBusterFunction
|
|
||||||
|
|
||||||
_PROJECT_BLOCKING_RULES = (
|
|
||||||
(
|
|
||||||
"subprocess.Popen.__init__",
|
|
||||||
BlockBusterFunction(
|
|
||||||
subprocess.Popen,
|
|
||||||
"__init__",
|
|
||||||
scanned_modules=["app", "deerflow"],
|
|
||||||
),
|
|
||||||
),
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
Do not add a runtime rule just because a business path is not tested. A rule
|
|
||||||
only expands what Blockbuster can intercept after code runs.
|
|
||||||
|
|
||||||
### Add a runtime anchor
|
|
||||||
|
|
||||||
Add a runtime anchor when a high-risk async production path should be protected
|
|
||||||
by CI but no existing `backend/tests/blocking_io/` test executes it.
|
|
||||||
|
|
||||||
Anchors belong in:
|
|
||||||
|
|
||||||
```text
|
|
||||||
backend/tests/blocking_io/
|
|
||||||
```
|
|
||||||
|
|
||||||
A good anchor should:
|
|
||||||
|
|
||||||
- Call the real production async entry point.
|
|
||||||
- Avoid bypassing the blocking surface with test-only `asyncio.to_thread`
|
|
||||||
wrappers.
|
|
||||||
- Use real local filesystem inputs when the bug shape is filesystem IO.
|
|
||||||
- Mock only the external dependency boundary, such as a network service or
|
|
||||||
third-party saver class.
|
|
||||||
- Fail if a future change moves the blocking operation back onto the event
|
|
||||||
loop.
|
|
||||||
|
|
||||||
Avoid testing only the low-level helper unless that helper is the production
|
|
||||||
async entry point. The runtime gate is most useful when it protects the caller
|
|
||||||
that production actually executes.
|
|
||||||
|
|
||||||
## Current runtime coverage
|
|
||||||
|
|
||||||
The runtime anchors protect confirmed blocking-IO bug shapes:
|
|
||||||
|
|
||||||
- SQLite checkpointer setup, including path resolution and parent-directory
|
|
||||||
creation.
|
|
||||||
- Subagent skill metadata loading through `SubagentExecutor._load_skills()`.
|
|
||||||
- `JsonlRunEventStore` async API (`put` / `list_*` / `delete_*`): the JSONL
|
|
||||||
run-event backend offloads its synchronous file IO via `asyncio.to_thread`
|
|
||||||
(fix #3084); this anchor drives the real async API under the gate so any
|
|
||||||
blocking IO reintroduced on the loop fails, not only removal of one
|
|
||||||
`to_thread` call.
|
|
||||||
- `UploadsMiddleware.before_agent` uploads-directory scan: a sync-only middleware
|
|
||||||
hook runs on the event loop under async graph execution, so the scan is
|
|
||||||
offloaded via `abefore_agent` + `run_in_executor`.
|
|
||||||
- Gate health checks: Blockbuster catches unoffloaded calls, opt-out works, and
|
|
||||||
patches are restored after exceptions.
|
|
||||||
|
|
||||||
As static detection and review identify more high-risk async paths, add new
|
|
||||||
runtime anchors incrementally.
|
|
||||||
@@ -36,7 +36,6 @@ models:
|
|||||||
- OpenAI (`langchain_openai:ChatOpenAI`)
|
- OpenAI (`langchain_openai:ChatOpenAI`)
|
||||||
- Anthropic (`langchain_anthropic:ChatAnthropic`)
|
- Anthropic (`langchain_anthropic:ChatAnthropic`)
|
||||||
- DeepSeek (`langchain_deepseek:ChatDeepSeek`)
|
- DeepSeek (`langchain_deepseek:ChatDeepSeek`)
|
||||||
- Xiaomi MiMo (`deerflow.models.patched_mimo:PatchedChatMiMo`)
|
|
||||||
- Claude Code OAuth (`deerflow.models.claude_provider:ClaudeChatModel`)
|
- Claude Code OAuth (`deerflow.models.claude_provider:ClaudeChatModel`)
|
||||||
- Codex CLI (`deerflow.models.openai_codex_provider:CodexChatModel`)
|
- Codex CLI (`deerflow.models.openai_codex_provider:CodexChatModel`)
|
||||||
- Any LangChain-compatible provider
|
- Any LangChain-compatible provider
|
||||||
@@ -167,37 +166,6 @@ models:
|
|||||||
|
|
||||||
For Gemini accessed **without** thinking (e.g. via OpenRouter where thinking is not activated), the plain `langchain_openai:ChatOpenAI` with `supports_thinking: false` is sufficient and no patch is needed.
|
For Gemini accessed **without** thinking (e.g. via OpenRouter where thinking is not activated), the plain `langchain_openai:ChatOpenAI` with `supports_thinking: false` is sufficient and no patch is needed.
|
||||||
|
|
||||||
**MiMo with thinking via OpenAI-compatible API**:
|
|
||||||
|
|
||||||
MiMo returns `reasoning_content` on assistant messages in thinking mode. In multi-turn agent conversations with tool calls, subsequent requests must preserve that historical `reasoning_content` on assistant messages or the MiMo API can return HTTP 400. Standard `langchain_openai:ChatOpenAI` drops this provider-specific field, so use `deerflow.models.patched_mimo:PatchedChatMiMo`:
|
|
||||||
|
|
||||||
For pay-as-you-go API keys (`sk-...`), use `https://api.xiaomimimo.com/v1`. For Token Plan keys (`tp-...`), use the regional Token Plan Base URL shown in the MiMo console, such as `https://token-plan-cn.xiaomimimo.com/v1`. MiMo documents these key types as separate and non-interchangeable.
|
|
||||||
|
|
||||||
`PatchedChatMiMo` is model-id agnostic. Use it for every MiMo thinking model entry you configure, including model entries referenced by `subagents.*.model` overrides (for example `mimo-v2.5-pro`, `mimo-v2.5`, `mimo-v2-pro`, `mimo-v2-omni`, or `mimo-v2-flash`).
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
models:
|
|
||||||
- name: mimo-v2.5-pro
|
|
||||||
display_name: MiMo V2.5 Pro
|
|
||||||
use: deerflow.models.patched_mimo:PatchedChatMiMo
|
|
||||||
model: mimo-v2.5-pro
|
|
||||||
api_key: $MIMO_API_KEY
|
|
||||||
base_url: https://api.xiaomimimo.com/v1
|
|
||||||
max_tokens: 8192
|
|
||||||
supports_thinking: true
|
|
||||||
supports_vision: false
|
|
||||||
when_thinking_enabled:
|
|
||||||
extra_body:
|
|
||||||
thinking:
|
|
||||||
type: enabled
|
|
||||||
when_thinking_disabled:
|
|
||||||
extra_body:
|
|
||||||
thinking:
|
|
||||||
type: disabled
|
|
||||||
```
|
|
||||||
|
|
||||||
`PatchedChatMiMo` preserves MiMo's `choices[].message.reasoning_content`, streaming `delta.reasoning_content`, and request-history assistant `reasoning_content` fields. It does not reuse the DeepSeek provider.
|
|
||||||
|
|
||||||
### Tool Groups
|
### Tool Groups
|
||||||
|
|
||||||
Organize tools into logical groups:
|
Organize tools into logical groups:
|
||||||
@@ -351,7 +319,6 @@ models:
|
|||||||
- `OPENAI_API_KEY` - OpenAI API key
|
- `OPENAI_API_KEY` - OpenAI API key
|
||||||
- `ANTHROPIC_API_KEY` - Anthropic API key
|
- `ANTHROPIC_API_KEY` - Anthropic API key
|
||||||
- `DEEPSEEK_API_KEY` - DeepSeek API key
|
- `DEEPSEEK_API_KEY` - DeepSeek API key
|
||||||
- `MIMO_API_KEY` - Xiaomi MiMo API key
|
|
||||||
- `NOVITA_API_KEY` - Novita API key (OpenAI-compatible endpoint)
|
- `NOVITA_API_KEY` - Novita API key (OpenAI-compatible endpoint)
|
||||||
- `TAVILY_API_KEY` - Tavily search API key
|
- `TAVILY_API_KEY` - Tavily search API key
|
||||||
- `DEER_FLOW_PROJECT_ROOT` - Project root for relative runtime paths
|
- `DEER_FLOW_PROJECT_ROOT` - Project root for relative runtime paths
|
||||||
|
|||||||
@@ -14,19 +14,6 @@ DeerFlow supports configurable MCP servers and skills to extend its capabilities
|
|||||||
3. Configure each server’s command, arguments, and environment variables as needed.
|
3. Configure each server’s command, arguments, and environment variables as needed.
|
||||||
4. Restart the application to load and register MCP tools.
|
4. Restart the application to load and register MCP tools.
|
||||||
|
|
||||||
## Filesystem MCP Servers
|
|
||||||
|
|
||||||
DeerFlow already provides built-in file tools for thread-scoped workspace access.
|
|
||||||
Do not add an MCP filesystem server for the same DeerFlow workspace. The
|
|
||||||
overlapping file tools use different path semantics, which can make LLM tool
|
|
||||||
selection and file access behavior unstable.
|
|
||||||
|
|
||||||
DeerFlow does not currently adapt the MCP Roots mode for filesystem servers. In
|
|
||||||
particular, it does not publish per-thread MCP roots or map DeerFlow sandbox
|
|
||||||
paths such as `/mnt/user-data/...` to paths accepted by
|
|
||||||
`@modelcontextprotocol/server-filesystem`. Use DeerFlow's built-in file tools
|
|
||||||
for DeerFlow workspace files.
|
|
||||||
|
|
||||||
## OAuth Support (HTTP/SSE MCP Servers)
|
## OAuth Support (HTTP/SSE MCP Servers)
|
||||||
|
|
||||||
For `http` and `sse` MCP servers, DeerFlow supports OAuth token acquisition and automatic token refresh.
|
For `http` and `sse` MCP servers, DeerFlow supports OAuth token acquisition and automatic token refresh.
|
||||||
@@ -101,6 +88,7 @@ MCP servers expose tools that are automatically discovered and integrated into D
|
|||||||
|
|
||||||
MCP servers can provide access to:
|
MCP servers can provide access to:
|
||||||
|
|
||||||
|
- **File systems**
|
||||||
- **Databases** (e.g., PostgreSQL)
|
- **Databases** (e.g., PostgreSQL)
|
||||||
- **External APIs** (e.g., GitHub, Brave Search)
|
- **External APIs** (e.g., GitHub, Brave Search)
|
||||||
- **Browser automation** (e.g., Puppeteer)
|
- **Browser automation** (e.g., Puppeteer)
|
||||||
@@ -109,4 +97,4 @@ MCP servers can provide access to:
|
|||||||
## Learn More
|
## Learn More
|
||||||
|
|
||||||
For detailed documentation about the Model Context Protocol, visit:
|
For detailed documentation about the Model Context Protocol, visit:
|
||||||
https://modelcontextprotocol.io
|
https://modelcontextprotocol.io
|
||||||
@@ -0,0 +1,401 @@
|
|||||||
|
# Storage Package Design
|
||||||
|
|
||||||
|
## Background
|
||||||
|
|
||||||
|
DeerFlow currently has several persistence responsibilities spread across app, gateway, runtime, and legacy persistence modules. This makes the persistence boundary difficult to reason about and creates several migration risks:
|
||||||
|
|
||||||
|
- Routers and runtime services can accidentally depend on concrete persistence implementations instead of stable contracts.
|
||||||
|
- User/auth, run metadata, thread metadata, feedback, run events, and checkpointer setup are initialized through different paths.
|
||||||
|
- Some persistence behavior is duplicated between memory, SQLite, and PostgreSQL-oriented code paths.
|
||||||
|
- Incremental migration is hard because app-level code and storage-level code are coupled.
|
||||||
|
- Adding or validating another SQL backend requires touching app/runtime code instead of a storage-owned package.
|
||||||
|
|
||||||
|
The storage package is introduced to make application data persistence a package-level capability with explicit contracts, a clear boundary, and SQL backend compatibility.
|
||||||
|
|
||||||
|
## Goals
|
||||||
|
|
||||||
|
- Provide a standalone `packages/storage` package for durable application data.
|
||||||
|
- Support SQLite, PostgreSQL, and MySQL through a shared persistence construction flow.
|
||||||
|
- Keep LangGraph checkpointer initialization compatible with the same database backend.
|
||||||
|
- Expose repository contracts as the only package-level data access boundary.
|
||||||
|
- Let the app layer depend on app-owned adapters under `app.infra.storage`, not on storage DB implementation classes.
|
||||||
|
- Allow the app/gateway migration to happen in small steps without forcing a large rewrite.
|
||||||
|
|
||||||
|
## Non-Goals
|
||||||
|
|
||||||
|
- This design does not remove legacy persistence in the first PR.
|
||||||
|
- This design does not move routers directly onto storage package models.
|
||||||
|
- This design does not make app routers own SQLAlchemy sessions.
|
||||||
|
- Cron persistence is intentionally out of scope for the storage package foundation.
|
||||||
|
- Memory backend is not part of the durable storage package. Memory compatibility, if still needed by app runtime, belongs outside `packages/storage`.
|
||||||
|
|
||||||
|
## Storage Design Principles
|
||||||
|
|
||||||
|
### Package-Owned Durable Storage
|
||||||
|
|
||||||
|
`packages/storage` owns durable application data persistence. It defines:
|
||||||
|
|
||||||
|
- configuration shape for storage-backed persistence
|
||||||
|
- SQLAlchemy models
|
||||||
|
- repository contracts and DTOs
|
||||||
|
- SQL repository implementations
|
||||||
|
- persistence factory functions
|
||||||
|
- compatibility helpers for config-driven initialization
|
||||||
|
|
||||||
|
The package should be usable without importing `app.gateway`, routers, auth providers, or runtime-specific gateway objects.
|
||||||
|
|
||||||
|
### SQL Backend Compatibility
|
||||||
|
|
||||||
|
The package supports three SQL backends:
|
||||||
|
|
||||||
|
- SQLite for local/single-node deployments
|
||||||
|
- PostgreSQL for production multi-node deployments
|
||||||
|
- MySQL for deployments that standardize on MySQL
|
||||||
|
|
||||||
|
Backend-specific differences are handled inside the storage package:
|
||||||
|
|
||||||
|
- SQLAlchemy async engine URL construction
|
||||||
|
- LangGraph checkpointer connection-string compatibility
|
||||||
|
- JSON metadata filtering across SQLite/PostgreSQL/MySQL
|
||||||
|
- SQL dialect behavior around locking, aggregation, and JSON type semantics
|
||||||
|
|
||||||
|
### Unified Persistence Bundle
|
||||||
|
|
||||||
|
Storage initialization returns an `AppPersistence` bundle:
|
||||||
|
|
||||||
|
```python
|
||||||
|
@dataclass(slots=True)
|
||||||
|
class AppPersistence:
|
||||||
|
checkpointer: Checkpointer
|
||||||
|
engine: AsyncEngine
|
||||||
|
session_factory: async_sessionmaker[AsyncSession]
|
||||||
|
setup: Callable[[], Awaitable[None]]
|
||||||
|
aclose: Callable[[], Awaitable[None]]
|
||||||
|
```
|
||||||
|
|
||||||
|
The app runtime can initialize persistence once, call `setup()`, and then inject:
|
||||||
|
|
||||||
|
- `checkpointer`
|
||||||
|
- `session_factory`
|
||||||
|
- repository adapters
|
||||||
|
|
||||||
|
This keeps checkpointer and application data aligned to the same backend without requiring routers to understand database configuration.
|
||||||
|
|
||||||
|
## Package Layout
|
||||||
|
|
||||||
|
```text
|
||||||
|
backend/packages/storage/
|
||||||
|
store/
|
||||||
|
config/
|
||||||
|
storage_config.py
|
||||||
|
app_config.py
|
||||||
|
persistence/
|
||||||
|
factory.py
|
||||||
|
types.py
|
||||||
|
base_model.py
|
||||||
|
json_compat.py
|
||||||
|
drivers/
|
||||||
|
sqlite.py
|
||||||
|
postgres.py
|
||||||
|
mysql.py
|
||||||
|
repositories/
|
||||||
|
contracts/
|
||||||
|
user.py
|
||||||
|
run.py
|
||||||
|
thread_meta.py
|
||||||
|
feedback.py
|
||||||
|
run_event.py
|
||||||
|
models/
|
||||||
|
user.py
|
||||||
|
run.py
|
||||||
|
thread_meta.py
|
||||||
|
feedback.py
|
||||||
|
run_event.py
|
||||||
|
db/
|
||||||
|
user.py
|
||||||
|
run.py
|
||||||
|
thread_meta.py
|
||||||
|
feedback.py
|
||||||
|
run_event.py
|
||||||
|
factory.py
|
||||||
|
```
|
||||||
|
|
||||||
|
## Persistence Construction
|
||||||
|
|
||||||
|
The primary storage entrypoint is:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from store.persistence import create_persistence_from_storage_config
|
||||||
|
|
||||||
|
persistence = await create_persistence_from_storage_config(storage_config)
|
||||||
|
await persistence.setup()
|
||||||
|
```
|
||||||
|
|
||||||
|
For app-level compatibility with existing database config shape:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from store.persistence import create_persistence_from_database_config
|
||||||
|
|
||||||
|
persistence = await create_persistence_from_database_config(config.database)
|
||||||
|
await persistence.setup()
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected app startup flow:
|
||||||
|
|
||||||
|
```python
|
||||||
|
persistence = await create_persistence_from_database_config(config.database)
|
||||||
|
await persistence.setup()
|
||||||
|
|
||||||
|
app.state.persistence = persistence
|
||||||
|
app.state.checkpointer = persistence.checkpointer
|
||||||
|
app.state.session_factory = persistence.session_factory
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected app shutdown flow:
|
||||||
|
|
||||||
|
```python
|
||||||
|
await app.state.persistence.aclose()
|
||||||
|
```
|
||||||
|
|
||||||
|
## Repository Contract Design
|
||||||
|
|
||||||
|
Repository contracts are the storage package's public data access boundary. They live under `store.repositories.contracts` and are re-exported from `store.repositories`.
|
||||||
|
|
||||||
|
The key contract groups are:
|
||||||
|
|
||||||
|
- `UserRepositoryProtocol`
|
||||||
|
- `RunRepositoryProtocol`
|
||||||
|
- `ThreadMetaRepositoryProtocol`
|
||||||
|
- `FeedbackRepositoryProtocol`
|
||||||
|
- `RunEventRepositoryProtocol`
|
||||||
|
|
||||||
|
Each contract owns:
|
||||||
|
|
||||||
|
- input DTOs, such as `UserCreate`, `RunCreate`, `ThreadMetaCreate`
|
||||||
|
- output DTOs, such as `User`, `Run`, `ThreadMeta`
|
||||||
|
- repository protocol methods
|
||||||
|
- domain-specific exceptions when needed, such as `InvalidMetadataFilterError`
|
||||||
|
|
||||||
|
Repository construction is session-based:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from store.repositories import build_run_repository
|
||||||
|
|
||||||
|
async with persistence.session_factory() as session:
|
||||||
|
repo = build_run_repository(session)
|
||||||
|
run = await repo.get_run(run_id)
|
||||||
|
```
|
||||||
|
|
||||||
|
This keeps transaction ownership explicit. The storage package does not hide commits or session lifecycle inside global singletons.
|
||||||
|
|
||||||
|
## App/Infra Calling Contract
|
||||||
|
|
||||||
|
The app layer should not call `store.repositories.db.*` directly. The intended app boundary is `app.infra.storage`.
|
||||||
|
|
||||||
|
`app.infra.storage` is responsible for:
|
||||||
|
|
||||||
|
- receiving `session_factory` from FastAPI runtime initialization
|
||||||
|
- owning session lifecycle for app-facing repository methods
|
||||||
|
- translating storage DTOs to app/gateway DTOs only when needed
|
||||||
|
- preserving the existing app-facing names during migration
|
||||||
|
- depending on storage repository protocols, not concrete DB classes
|
||||||
|
|
||||||
|
Expected adapter pattern:
|
||||||
|
|
||||||
|
```python
|
||||||
|
class StorageRunRepository(RunRepositoryProtocol):
|
||||||
|
def __init__(self, session_factory):
|
||||||
|
self._session_factory = session_factory
|
||||||
|
|
||||||
|
async def get_run(self, run_id: str):
|
||||||
|
async with self._session_factory() as session:
|
||||||
|
repo = build_run_repository(session)
|
||||||
|
return await repo.get_run(run_id)
|
||||||
|
```
|
||||||
|
|
||||||
|
For gateway compatibility, app state can keep existing names while the implementation changes:
|
||||||
|
|
||||||
|
```python
|
||||||
|
app.state.run_store = StorageRunStore(run_repository)
|
||||||
|
app.state.feedback_repo = StorageFeedbackStore(feedback_repository)
|
||||||
|
app.state.thread_store = StorageThreadMetaStore(thread_meta_repository)
|
||||||
|
app.state.run_event_store = StorageRunEventStore(run_event_repository)
|
||||||
|
app.state.checkpointer = persistence.checkpointer
|
||||||
|
app.state.session_factory = persistence.session_factory
|
||||||
|
```
|
||||||
|
|
||||||
|
The app-facing objects may expose legacy method names during migration, but their internal data access should go through storage contracts.
|
||||||
|
|
||||||
|
## Boundary Rules
|
||||||
|
|
||||||
|
### Allowed Calls
|
||||||
|
|
||||||
|
Storage package callers may use:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from store.persistence import create_persistence_from_database_config
|
||||||
|
from store.persistence import create_persistence_from_storage_config
|
||||||
|
from store.repositories import build_run_repository
|
||||||
|
from store.repositories import build_user_repository
|
||||||
|
from store.repositories import build_thread_meta_repository
|
||||||
|
from store.repositories import build_feedback_repository
|
||||||
|
from store.repositories import build_run_event_repository
|
||||||
|
from store.repositories import RunRepositoryProtocol
|
||||||
|
from store.repositories import UserRepositoryProtocol
|
||||||
|
```
|
||||||
|
|
||||||
|
App layer callers should use:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from app.infra.storage import StorageRunRepository
|
||||||
|
from app.infra.storage import StorageUserDataRepository
|
||||||
|
from app.infra.storage import StorageThreadMetaRepository
|
||||||
|
from app.infra.storage import StorageFeedbackRepository
|
||||||
|
from app.infra.storage import StorageRunEventRepository
|
||||||
|
```
|
||||||
|
|
||||||
|
### Prohibited Calls
|
||||||
|
|
||||||
|
App/gateway/router/auth code must not import:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from store.repositories.db import DbRunRepository
|
||||||
|
from store.repositories.models import Run
|
||||||
|
from store.persistence.base_model import MappedBase
|
||||||
|
```
|
||||||
|
|
||||||
|
Routers must not:
|
||||||
|
|
||||||
|
- create SQLAlchemy engines
|
||||||
|
- create SQLAlchemy sessions directly
|
||||||
|
- call storage DB repository classes directly
|
||||||
|
- commit/rollback storage transactions directly unless explicitly scoped by an infra adapter
|
||||||
|
- depend on storage SQLAlchemy model classes
|
||||||
|
|
||||||
|
Storage package code must not import:
|
||||||
|
|
||||||
|
```python
|
||||||
|
import app.gateway
|
||||||
|
import app.infra
|
||||||
|
import deerflow.runtime
|
||||||
|
```
|
||||||
|
|
||||||
|
The dependency direction is:
|
||||||
|
|
||||||
|
```text
|
||||||
|
app/gateway -> app.infra.storage -> packages/storage contracts/factories -> packages/storage db implementations
|
||||||
|
```
|
||||||
|
|
||||||
|
The reverse direction is forbidden.
|
||||||
|
|
||||||
|
## Checkpointer Compatibility
|
||||||
|
|
||||||
|
The storage persistence bundle initializes the LangGraph checkpointer alongside application data persistence.
|
||||||
|
|
||||||
|
Backend-specific notes:
|
||||||
|
|
||||||
|
- SQLite uses `langgraph-checkpoint-sqlite`.
|
||||||
|
- PostgreSQL uses `langgraph-checkpoint-postgres` and requires a string `postgresql://...` connection URL.
|
||||||
|
- MySQL uses `langgraph-checkpoint-mysql` and requires a string MySQL connection URL.
|
||||||
|
|
||||||
|
SQLAlchemy may use async driver URLs such as `postgresql+asyncpg://...` or `mysql+aiomysql://...`, but LangGraph checkpointer constructors expect plain string connection URLs. This conversion belongs inside the storage driver implementation.
|
||||||
|
|
||||||
|
## JSON Metadata Filtering
|
||||||
|
|
||||||
|
Thread metadata search supports dialect-aware JSON filtering through `store.persistence.json_compat`.
|
||||||
|
|
||||||
|
The matcher supports:
|
||||||
|
|
||||||
|
- `None`
|
||||||
|
- `bool`
|
||||||
|
- `int`
|
||||||
|
- `float`
|
||||||
|
- `str`
|
||||||
|
|
||||||
|
It rejects:
|
||||||
|
|
||||||
|
- unsafe keys
|
||||||
|
- nested JSON path expressions
|
||||||
|
- dict/list values
|
||||||
|
- integers outside signed 64-bit range
|
||||||
|
|
||||||
|
This prevents SQL/JSON path injection, avoids compiled-cache type drift, and preserves type semantics such as `True != 1` and explicit JSON `null` not matching a missing key.
|
||||||
|
|
||||||
|
## Step-by-Step Implementation Plan
|
||||||
|
|
||||||
|
### Step 1: Introduce Storage Package Foundation
|
||||||
|
|
||||||
|
- Add `backend/packages/storage`.
|
||||||
|
- Add storage config models.
|
||||||
|
- Add `AppPersistence`.
|
||||||
|
- Add SQLite/PostgreSQL/MySQL persistence drivers.
|
||||||
|
- Add repository contracts, models, DB implementations, and factory helpers.
|
||||||
|
- Add package dependency wiring.
|
||||||
|
- Exclude cron persistence.
|
||||||
|
|
||||||
|
### Step 2: Harden Storage Backend Compatibility
|
||||||
|
|
||||||
|
- Validate SQLite setup and repository behavior.
|
||||||
|
- Validate PostgreSQL and MySQL with local E2E tests.
|
||||||
|
- Fix checkpointer connection-string compatibility.
|
||||||
|
- Fix PostgreSQL locking and aggregation differences.
|
||||||
|
- Add dialect-aware JSON metadata filtering.
|
||||||
|
|
||||||
|
### Step 3: Add App Infra Adapters
|
||||||
|
|
||||||
|
- Add `backend/app/infra/storage`.
|
||||||
|
- Implement app-facing repositories that own session lifecycle.
|
||||||
|
- Keep storage contracts as the only data access boundary.
|
||||||
|
- Add legacy compatibility adapters for existing app/gateway method shapes.
|
||||||
|
- Keep app/gateway imports out of `packages/storage`.
|
||||||
|
|
||||||
|
### Step 4: Switch FastAPI Runtime Injection
|
||||||
|
|
||||||
|
- Initialize storage persistence in FastAPI startup/lifespan.
|
||||||
|
- Attach `persistence`, `checkpointer`, and `session_factory` to `app.state`.
|
||||||
|
- Preserve existing external state names:
|
||||||
|
- `run_store`
|
||||||
|
- `feedback_repo`
|
||||||
|
- `thread_store`
|
||||||
|
- `run_event_store`
|
||||||
|
- `checkpointer`
|
||||||
|
- `session_factory`
|
||||||
|
- Start with user/auth provider construction, then migrate run/thread/feedback/run_event.
|
||||||
|
|
||||||
|
### Step 5: Router and Auth Compatibility
|
||||||
|
|
||||||
|
- Ensure routers consume app-facing adapters, not storage DB classes.
|
||||||
|
- Ensure auth providers depend on user repository contracts.
|
||||||
|
- Keep router response shapes unchanged.
|
||||||
|
- Add focused auth/admin/router regression tests.
|
||||||
|
|
||||||
|
### Step 6: Cleanup Legacy Persistence
|
||||||
|
|
||||||
|
- Compare old persistence usage after app/gateway migration.
|
||||||
|
- Remove unused old repository implementations only after all call sites move.
|
||||||
|
- Keep compatibility shims only where needed for a transition window.
|
||||||
|
- Delete memory backend paths from storage-owned durable persistence.
|
||||||
|
|
||||||
|
## Testing Strategy
|
||||||
|
|
||||||
|
Unit tests should cover:
|
||||||
|
|
||||||
|
- config parsing
|
||||||
|
- persistence setup
|
||||||
|
- table creation
|
||||||
|
- repository CRUD/query behavior
|
||||||
|
- typed JSON metadata filtering
|
||||||
|
- dialect SQL compilation
|
||||||
|
- cron exclusion
|
||||||
|
|
||||||
|
E2E tests should cover:
|
||||||
|
|
||||||
|
- SQLite persistence setup
|
||||||
|
- PostgreSQL temporary database setup
|
||||||
|
- MySQL temporary database setup
|
||||||
|
- repository contract behavior across all supported SQL backends
|
||||||
|
- JSON/Unicode round trip
|
||||||
|
- rollback behavior
|
||||||
|
- persistence close/cleanup
|
||||||
|
|
||||||
|
E2E tests may remain local-only if CI does not provide PostgreSQL/MySQL services.
|
||||||
@@ -0,0 +1,401 @@
|
|||||||
|
# Storage Package 设计文档
|
||||||
|
|
||||||
|
## 背景
|
||||||
|
|
||||||
|
DeerFlow 当前有多类持久化职责分散在 app、gateway、runtime 和旧 persistence 模块中。这会带来几个问题:
|
||||||
|
|
||||||
|
- routers 和 runtime services 容易依赖具体 persistence 实现,而不是稳定契约。
|
||||||
|
- user/auth、run metadata、thread metadata、feedback、run events、checkpointer setup 的初始化路径不统一。
|
||||||
|
- memory、SQLite、PostgreSQL 相关路径中存在部分重复逻辑。
|
||||||
|
- app 层代码和 storage 层代码耦合,导致增量迁移困难。
|
||||||
|
- 增加或验证新的 SQL backend 时,需要改动 app/runtime,而不是只改 storage package。
|
||||||
|
|
||||||
|
引入 storage package 的目标,是把应用数据持久化抽象成 package 级能力,并提供明确契约、清晰边界和 SQL backend 兼容性。
|
||||||
|
|
||||||
|
## 目标
|
||||||
|
|
||||||
|
- 新增独立的 `packages/storage`,负责 durable application data。
|
||||||
|
- 通过统一 persistence 构造流程支持 SQLite、PostgreSQL、MySQL。
|
||||||
|
- 保持 LangGraph checkpointer 与同一个数据库 backend 兼容。
|
||||||
|
- 将 repository contracts 作为 package 对外唯一数据访问边界。
|
||||||
|
- app 层通过 `app.infra.storage` 适配 storage,而不是直接依赖 storage DB 实现类。
|
||||||
|
- 支持 app/gateway 后续小步迁移,避免一次性大重构。
|
||||||
|
|
||||||
|
## 非目标
|
||||||
|
|
||||||
|
- 第一阶段不删除旧 persistence。
|
||||||
|
- 不让 routers 直接依赖 storage package models。
|
||||||
|
- 不让 app routers 管理 SQLAlchemy sessions。
|
||||||
|
- cron persistence 不属于 storage package 基础迁移范围。
|
||||||
|
- memory backend 不属于 durable storage package。若 app runtime 仍需要 memory 兼容,应放在 `packages/storage` 之外。
|
||||||
|
|
||||||
|
## Storage 设计理念
|
||||||
|
|
||||||
|
### Package 自己负责 Durable Storage
|
||||||
|
|
||||||
|
`packages/storage` 负责应用数据的 durable persistence,包括:
|
||||||
|
|
||||||
|
- storage 持久化配置
|
||||||
|
- SQLAlchemy models
|
||||||
|
- repository contracts 和 DTOs
|
||||||
|
- SQL repository 实现
|
||||||
|
- persistence factory functions
|
||||||
|
- 面向现有 config 的兼容初始化入口
|
||||||
|
|
||||||
|
该 package 不应该 import `app.gateway`、routers、auth providers 或 runtime 中的 gateway 对象。
|
||||||
|
|
||||||
|
### SQL Backend 兼容
|
||||||
|
|
||||||
|
该 package 支持三种 SQL backend:
|
||||||
|
|
||||||
|
- SQLite:本地或单节点部署
|
||||||
|
- PostgreSQL:生产多节点部署
|
||||||
|
- MySQL:使用 MySQL 作为标准数据库的部署
|
||||||
|
|
||||||
|
backend 差异在 storage package 内部处理:
|
||||||
|
|
||||||
|
- SQLAlchemy async engine URL 构造
|
||||||
|
- LangGraph checkpointer 连接串兼容
|
||||||
|
- SQLite/PostgreSQL/MySQL 的 JSON metadata filter
|
||||||
|
- 不同 SQL 方言在 locking、aggregation、JSON 类型语义上的差异
|
||||||
|
|
||||||
|
### 统一 Persistence Bundle
|
||||||
|
|
||||||
|
Storage 初始化返回 `AppPersistence` bundle:
|
||||||
|
|
||||||
|
```python
|
||||||
|
@dataclass(slots=True)
|
||||||
|
class AppPersistence:
|
||||||
|
checkpointer: Checkpointer
|
||||||
|
engine: AsyncEngine
|
||||||
|
session_factory: async_sessionmaker[AsyncSession]
|
||||||
|
setup: Callable[[], Awaitable[None]]
|
||||||
|
aclose: Callable[[], Awaitable[None]]
|
||||||
|
```
|
||||||
|
|
||||||
|
app runtime 只需要初始化一次 persistence,调用 `setup()`,然后注入:
|
||||||
|
|
||||||
|
- `checkpointer`
|
||||||
|
- `session_factory`
|
||||||
|
- repository adapters
|
||||||
|
|
||||||
|
这样 checkpointer 和应用数据可以对齐到同一个 backend,同时 routers 不需要理解数据库配置。
|
||||||
|
|
||||||
|
## Package 结构
|
||||||
|
|
||||||
|
```text
|
||||||
|
backend/packages/storage/
|
||||||
|
store/
|
||||||
|
config/
|
||||||
|
storage_config.py
|
||||||
|
app_config.py
|
||||||
|
persistence/
|
||||||
|
factory.py
|
||||||
|
types.py
|
||||||
|
base_model.py
|
||||||
|
json_compat.py
|
||||||
|
drivers/
|
||||||
|
sqlite.py
|
||||||
|
postgres.py
|
||||||
|
mysql.py
|
||||||
|
repositories/
|
||||||
|
contracts/
|
||||||
|
user.py
|
||||||
|
run.py
|
||||||
|
thread_meta.py
|
||||||
|
feedback.py
|
||||||
|
run_event.py
|
||||||
|
models/
|
||||||
|
user.py
|
||||||
|
run.py
|
||||||
|
thread_meta.py
|
||||||
|
feedback.py
|
||||||
|
run_event.py
|
||||||
|
db/
|
||||||
|
user.py
|
||||||
|
run.py
|
||||||
|
thread_meta.py
|
||||||
|
feedback.py
|
||||||
|
run_event.py
|
||||||
|
factory.py
|
||||||
|
```
|
||||||
|
|
||||||
|
## Persistence 构造
|
||||||
|
|
||||||
|
storage 的主要入口:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from store.persistence import create_persistence_from_storage_config
|
||||||
|
|
||||||
|
persistence = await create_persistence_from_storage_config(storage_config)
|
||||||
|
await persistence.setup()
|
||||||
|
```
|
||||||
|
|
||||||
|
为了兼容现有 app database config,也提供:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from store.persistence import create_persistence_from_database_config
|
||||||
|
|
||||||
|
persistence = await create_persistence_from_database_config(config.database)
|
||||||
|
await persistence.setup()
|
||||||
|
```
|
||||||
|
|
||||||
|
预期 app startup 流程:
|
||||||
|
|
||||||
|
```python
|
||||||
|
persistence = await create_persistence_from_database_config(config.database)
|
||||||
|
await persistence.setup()
|
||||||
|
|
||||||
|
app.state.persistence = persistence
|
||||||
|
app.state.checkpointer = persistence.checkpointer
|
||||||
|
app.state.session_factory = persistence.session_factory
|
||||||
|
```
|
||||||
|
|
||||||
|
预期 app shutdown 流程:
|
||||||
|
|
||||||
|
```python
|
||||||
|
await app.state.persistence.aclose()
|
||||||
|
```
|
||||||
|
|
||||||
|
## Repository 契约设计
|
||||||
|
|
||||||
|
Repository contracts 是 storage package 对外公开的数据访问边界。它们位于 `store.repositories.contracts`,并通过 `store.repositories` re-export。
|
||||||
|
|
||||||
|
主要契约包括:
|
||||||
|
|
||||||
|
- `UserRepositoryProtocol`
|
||||||
|
- `RunRepositoryProtocol`
|
||||||
|
- `ThreadMetaRepositoryProtocol`
|
||||||
|
- `FeedbackRepositoryProtocol`
|
||||||
|
- `RunEventRepositoryProtocol`
|
||||||
|
|
||||||
|
每组契约包含:
|
||||||
|
|
||||||
|
- 输入 DTO,例如 `UserCreate`、`RunCreate`、`ThreadMetaCreate`
|
||||||
|
- 输出 DTO,例如 `User`、`Run`、`ThreadMeta`
|
||||||
|
- repository protocol methods
|
||||||
|
- 必要的领域异常,例如 `InvalidMetadataFilterError`
|
||||||
|
|
||||||
|
Repository 通过 session 构造:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from store.repositories import build_run_repository
|
||||||
|
|
||||||
|
async with persistence.session_factory() as session:
|
||||||
|
repo = build_run_repository(session)
|
||||||
|
run = await repo.get_run(run_id)
|
||||||
|
```
|
||||||
|
|
||||||
|
这样可以让 transaction ownership 保持明确。storage package 不通过全局 singleton 隐式隐藏 commit 或 session 生命周期。
|
||||||
|
|
||||||
|
## App/Infra 调用契约
|
||||||
|
|
||||||
|
app 层不应该直接调用 `store.repositories.db.*`。预期的 app 边界是 `app.infra.storage`。
|
||||||
|
|
||||||
|
`app.infra.storage` 负责:
|
||||||
|
|
||||||
|
- 从 FastAPI runtime 初始化中接收 `session_factory`
|
||||||
|
- 为 app-facing repository methods 管理 session 生命周期
|
||||||
|
- 在必要时将 storage DTOs 转成 app/gateway DTOs
|
||||||
|
- 迁移期间保留现有 app-facing 名称
|
||||||
|
- 依赖 storage repository protocols,而不是具体 DB classes
|
||||||
|
|
||||||
|
预期 adapter 模式:
|
||||||
|
|
||||||
|
```python
|
||||||
|
class StorageRunRepository(RunRepositoryProtocol):
|
||||||
|
def __init__(self, session_factory):
|
||||||
|
self._session_factory = session_factory
|
||||||
|
|
||||||
|
async def get_run(self, run_id: str):
|
||||||
|
async with self._session_factory() as session:
|
||||||
|
repo = build_run_repository(session)
|
||||||
|
return await repo.get_run(run_id)
|
||||||
|
```
|
||||||
|
|
||||||
|
为了兼容 gateway,app state 可以暂时保持现有名字,只替换内部实现:
|
||||||
|
|
||||||
|
```python
|
||||||
|
app.state.run_store = StorageRunStore(run_repository)
|
||||||
|
app.state.feedback_repo = StorageFeedbackStore(feedback_repository)
|
||||||
|
app.state.thread_store = StorageThreadMetaStore(thread_meta_repository)
|
||||||
|
app.state.run_event_store = StorageRunEventStore(run_event_repository)
|
||||||
|
app.state.checkpointer = persistence.checkpointer
|
||||||
|
app.state.session_factory = persistence.session_factory
|
||||||
|
```
|
||||||
|
|
||||||
|
app-facing objects 可以在迁移期间保留旧方法名,但内部数据访问必须经过 storage contracts。
|
||||||
|
|
||||||
|
## 边界规则
|
||||||
|
|
||||||
|
### 允许调用的范围
|
||||||
|
|
||||||
|
storage package 调用方可以使用:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from store.persistence import create_persistence_from_database_config
|
||||||
|
from store.persistence import create_persistence_from_storage_config
|
||||||
|
from store.repositories import build_run_repository
|
||||||
|
from store.repositories import build_user_repository
|
||||||
|
from store.repositories import build_thread_meta_repository
|
||||||
|
from store.repositories import build_feedback_repository
|
||||||
|
from store.repositories import build_run_event_repository
|
||||||
|
from store.repositories import RunRepositoryProtocol
|
||||||
|
from store.repositories import UserRepositoryProtocol
|
||||||
|
```
|
||||||
|
|
||||||
|
app 层应该使用:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from app.infra.storage import StorageRunRepository
|
||||||
|
from app.infra.storage import StorageUserDataRepository
|
||||||
|
from app.infra.storage import StorageThreadMetaRepository
|
||||||
|
from app.infra.storage import StorageFeedbackRepository
|
||||||
|
from app.infra.storage import StorageRunEventRepository
|
||||||
|
```
|
||||||
|
|
||||||
|
### 禁止调用的范围
|
||||||
|
|
||||||
|
app/gateway/router/auth 代码不应该 import:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from store.repositories.db import DbRunRepository
|
||||||
|
from store.repositories.models import Run
|
||||||
|
from store.persistence.base_model import MappedBase
|
||||||
|
```
|
||||||
|
|
||||||
|
routers 禁止:
|
||||||
|
|
||||||
|
- 创建 SQLAlchemy engines
|
||||||
|
- 直接创建 SQLAlchemy sessions
|
||||||
|
- 直接调用 storage DB repository classes
|
||||||
|
- 直接 commit/rollback storage transactions,除非这是 infra adapter 明确管理的范围
|
||||||
|
- 依赖 storage SQLAlchemy model classes
|
||||||
|
|
||||||
|
storage package 禁止 import:
|
||||||
|
|
||||||
|
```python
|
||||||
|
import app.gateway
|
||||||
|
import app.infra
|
||||||
|
import deerflow.runtime
|
||||||
|
```
|
||||||
|
|
||||||
|
依赖方向必须是:
|
||||||
|
|
||||||
|
```text
|
||||||
|
app/gateway -> app.infra.storage -> packages/storage contracts/factories -> packages/storage db implementations
|
||||||
|
```
|
||||||
|
|
||||||
|
禁止反向依赖。
|
||||||
|
|
||||||
|
## Checkpointer 兼容
|
||||||
|
|
||||||
|
storage persistence bundle 会同时初始化 LangGraph checkpointer 和应用数据持久化。
|
||||||
|
|
||||||
|
backend 说明:
|
||||||
|
|
||||||
|
- SQLite 使用 `langgraph-checkpoint-sqlite`。
|
||||||
|
- PostgreSQL 使用 `langgraph-checkpoint-postgres`,需要字符串形式的 `postgresql://...` 连接串。
|
||||||
|
- MySQL 使用 `langgraph-checkpoint-mysql`,需要字符串形式的 MySQL 连接串。
|
||||||
|
|
||||||
|
SQLAlchemy 可以使用 `postgresql+asyncpg://...` 或 `mysql+aiomysql://...` 这类 async driver URL,但 LangGraph checkpointer 构造函数需要普通字符串连接串。这个转换应该封装在 storage driver implementation 内部。
|
||||||
|
|
||||||
|
## JSON Metadata Filtering
|
||||||
|
|
||||||
|
Thread metadata search 通过 `store.persistence.json_compat` 支持跨方言 JSON filtering。
|
||||||
|
|
||||||
|
支持的 filter value 类型:
|
||||||
|
|
||||||
|
- `None`
|
||||||
|
- `bool`
|
||||||
|
- `int`
|
||||||
|
- `float`
|
||||||
|
- `str`
|
||||||
|
|
||||||
|
拒绝:
|
||||||
|
|
||||||
|
- unsafe keys
|
||||||
|
- nested JSON path expressions
|
||||||
|
- dict/list values
|
||||||
|
- 超出 signed 64-bit 范围的整数
|
||||||
|
|
||||||
|
这样可以避免 SQL/JSON path injection,避免 compiled-cache 类型漂移,并保留类型语义,例如 `True != 1`,显式 JSON `null` 不等于 missing key。
|
||||||
|
|
||||||
|
## 分步实现方案
|
||||||
|
|
||||||
|
### 第 1 步:新增 Storage Package 基础
|
||||||
|
|
||||||
|
- 新增 `backend/packages/storage`。
|
||||||
|
- 增加 storage config models。
|
||||||
|
- 增加 `AppPersistence`。
|
||||||
|
- 增加 SQLite/PostgreSQL/MySQL persistence drivers。
|
||||||
|
- 增加 repository contracts、models、DB implementations 和 factory helpers。
|
||||||
|
- 接入 package dependency。
|
||||||
|
- 排除 cron persistence。
|
||||||
|
|
||||||
|
### 第 2 步:补齐 Storage Backend 兼容性
|
||||||
|
|
||||||
|
- 验证 SQLite setup 和 repository 行为。
|
||||||
|
- 使用本地 E2E 验证 PostgreSQL 和 MySQL。
|
||||||
|
- 修复 checkpointer 连接串兼容。
|
||||||
|
- 修复 PostgreSQL locking 和 aggregation 差异。
|
||||||
|
- 增加跨方言 JSON metadata filtering。
|
||||||
|
|
||||||
|
### 第 3 步:新增 App Infra Adapters
|
||||||
|
|
||||||
|
- 新增 `backend/app/infra/storage`。
|
||||||
|
- 实现 app-facing repositories,由它们管理 session 生命周期。
|
||||||
|
- 保持 storage contracts 作为唯一数据访问边界。
|
||||||
|
- 为现有 app/gateway method shape 增加兼容 adapters。
|
||||||
|
- 避免 `packages/storage` import app/gateway。
|
||||||
|
|
||||||
|
### 第 4 步:切换 FastAPI Runtime 注入
|
||||||
|
|
||||||
|
- 在 FastAPI startup/lifespan 中初始化 storage persistence。
|
||||||
|
- 将 `persistence`、`checkpointer`、`session_factory` 注入 `app.state`。
|
||||||
|
- 暂时保留现有对外 state 名称:
|
||||||
|
- `run_store`
|
||||||
|
- `feedback_repo`
|
||||||
|
- `thread_store`
|
||||||
|
- `run_event_store`
|
||||||
|
- `checkpointer`
|
||||||
|
- `session_factory`
|
||||||
|
- 先切 user/auth provider 构造,再逐步迁移 run/thread/feedback/run_event。
|
||||||
|
|
||||||
|
### 第 5 步:Router 和 Auth 兼容
|
||||||
|
|
||||||
|
- 确保 routers 消费 app-facing adapters,而不是 storage DB classes。
|
||||||
|
- 确保 auth providers 依赖 user repository contracts。
|
||||||
|
- 保持 router response shapes 不变。
|
||||||
|
- 增加 auth/admin/router regression tests。
|
||||||
|
|
||||||
|
### 第 6 步:清理旧 Persistence
|
||||||
|
|
||||||
|
- app/gateway 迁移完成后,再比较旧 persistence usage。
|
||||||
|
- 所有 call sites 迁移完成后,再删除未使用的旧 repository implementations。
|
||||||
|
- 只在必要时保留短期 compatibility shims。
|
||||||
|
- 从 storage-owned durable persistence 中移除 memory backend 路径。
|
||||||
|
|
||||||
|
## 测试策略
|
||||||
|
|
||||||
|
单测应覆盖:
|
||||||
|
|
||||||
|
- config parsing
|
||||||
|
- persistence setup
|
||||||
|
- table creation
|
||||||
|
- repository CRUD/query behavior
|
||||||
|
- typed JSON metadata filtering
|
||||||
|
- dialect SQL compilation
|
||||||
|
- cron exclusion
|
||||||
|
|
||||||
|
E2E 应覆盖:
|
||||||
|
|
||||||
|
- SQLite persistence setup
|
||||||
|
- PostgreSQL temporary database setup
|
||||||
|
- MySQL temporary database setup
|
||||||
|
- 所有支持 SQL backend 下的 repository contract 行为
|
||||||
|
- JSON/Unicode round trip
|
||||||
|
- rollback behavior
|
||||||
|
- persistence close/cleanup
|
||||||
|
|
||||||
|
如果 CI 暂时没有 PostgreSQL/MySQL services,E2E 可以先作为 local-only 验证保留。
|
||||||
@@ -26,7 +26,7 @@
|
|||||||
- Replace sync `requests` with `httpx.AsyncClient` in community tools (tavily, jina_ai, firecrawl, infoquest, image_search)
|
- Replace sync `requests` with `httpx.AsyncClient` in community tools (tavily, jina_ai, firecrawl, infoquest, image_search)
|
||||||
- [x] Replace sync `model.invoke()` with async `model.ainvoke()` in title_middleware and memory updater
|
- [x] Replace sync `model.invoke()` with async `model.ainvoke()` in title_middleware and memory updater
|
||||||
- Consider `asyncio.to_thread()` wrapper for remaining blocking file I/O
|
- Consider `asyncio.to_thread()` wrapper for remaining blocking file I/O
|
||||||
- For production: tune Gateway worker/runtime settings for long-running agent workloads
|
- For production: use `langgraph up` (multi-worker) instead of `langgraph dev` (single-worker)
|
||||||
|
|
||||||
## Resolved Issues
|
## Resolved Issues
|
||||||
|
|
||||||
|
|||||||
@@ -4,22 +4,22 @@
|
|||||||
|
|
||||||
`create_deerflow_agent` 通过 `RuntimeFeatures` 组装的完整 middleware 链(默认全开时):
|
`create_deerflow_agent` 通过 `RuntimeFeatures` 组装的完整 middleware 链(默认全开时):
|
||||||
|
|
||||||
| # | Middleware | `before_agent` | `before_model` | `after_model` | `after_agent` | `wrap_model_call` | `wrap_tool_call` | 主 Agent | Subagent | 来源 |
|
| # | Middleware | `before_agent` | `before_model` | `after_model` | `after_agent` | `wrap_tool_call` | 主 Agent | Subagent | 来源 |
|
||||||
|---|-----------|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|------|
|
|---|-----------|:-:|:-:|:-:|:-:|:-:|:-:|:-:|------|
|
||||||
| 0 | ThreadDataMiddleware | ✓ | | | | | | ✓ | ✓ | `sandbox` |
|
| 0 | ThreadDataMiddleware | ✓ | | | | | ✓ | ✓ | `sandbox` |
|
||||||
| 1 | UploadsMiddleware | ✓ | | | | | | ✓ | ✗ | `sandbox` |
|
| 1 | UploadsMiddleware | ✓ | | | | | ✓ | ✗ | `sandbox` |
|
||||||
| 2 | SandboxMiddleware | ✓ | | | ✓ | | | ✓ | ✓ | `sandbox` |
|
| 2 | SandboxMiddleware | ✓ | | | ✓ | | ✓ | ✓ | `sandbox` |
|
||||||
| 3 | DanglingToolCallMiddleware | | | | | ✓ | | ✓ | ✗ | 始终开启 |
|
| 3 | DanglingToolCallMiddleware | | | ✓ | | | ✓ | ✗ | 始终开启 |
|
||||||
| 4 | GuardrailMiddleware | | | | | | ✓ | ✓ | ✓ | *Phase 2 纳入* |
|
| 4 | GuardrailMiddleware | | | | | ✓ | ✓ | ✓ | *Phase 2 纳入* |
|
||||||
| 5 | ToolErrorHandlingMiddleware | | | | | | ✓ | ✓ | ✓ | 始终开启 |
|
| 5 | ToolErrorHandlingMiddleware | | | | | ✓ | ✓ | ✓ | 始终开启 |
|
||||||
| 6 | SummarizationMiddleware | | ✓ | | | | | ✓ | ✗ | `summarization` |
|
| 6 | SummarizationMiddleware | | | ✓ | | | ✓ | ✗ | `summarization` |
|
||||||
| 7 | TodoMiddleware | | ✓ | ✓ | | ✓ | | ✓ | ✗ | `plan_mode` 参数 |
|
| 7 | TodoMiddleware | | | ✓ | | | ✓ | ✗ | `plan_mode` 参数 |
|
||||||
| 8 | TitleMiddleware | | | ✓ | | | | ✓ | ✗ | `auto_title` |
|
| 8 | TitleMiddleware | | | ✓ | | | ✓ | ✗ | `auto_title` |
|
||||||
| 9 | MemoryMiddleware | | | | ✓ | | | ✓ | ✗ | `memory` |
|
| 9 | MemoryMiddleware | | | | ✓ | | ✓ | ✗ | `memory` |
|
||||||
| 10 | ViewImageMiddleware | | ✓ | | | | | ✓ | ✗ | `vision` |
|
| 10 | ViewImageMiddleware | | ✓ | | | | ✓ | ✗ | `vision` |
|
||||||
| 11 | SubagentLimitMiddleware | | | ✓ | | | | ✓ | ✗ | `subagent` |
|
| 11 | SubagentLimitMiddleware | | | ✓ | | | ✓ | ✗ | `subagent` |
|
||||||
| 12 | LoopDetectionMiddleware | ✓ | | ✓ | ✓ | ✓ | | ✓ | ✗ | 始终开启 |
|
| 12 | LoopDetectionMiddleware | | | ✓ | | | ✓ | ✗ | 始终开启 |
|
||||||
| 13 | ClarificationMiddleware | | | | | | ✓ | ✓ | ✗ | 始终最后 |
|
| 13 | ClarificationMiddleware | | | ✓ | | | ✓ | ✗ | 始终最后 |
|
||||||
|
|
||||||
主 agent **14 个** middleware(`make_lead_agent`),subagent **4 个**(ThreadData、Sandbox、Guardrail、ToolErrorHandling)。`create_deerflow_agent` Phase 1 实现 **13 个**(Guardrail 仅支持自定义实例,无内置默认)。
|
主 agent **14 个** middleware(`make_lead_agent`),subagent **4 个**(ThreadData、Sandbox、Guardrail、ToolErrorHandling)。`create_deerflow_agent` Phase 1 实现 **13 个**(Guardrail 仅支持自定义实例,无内置默认)。
|
||||||
|
|
||||||
@@ -35,7 +35,7 @@ graph TB
|
|||||||
|
|
||||||
subgraph BA ["<b>before_agent</b> 正序 0→N"]
|
subgraph BA ["<b>before_agent</b> 正序 0→N"]
|
||||||
direction TB
|
direction TB
|
||||||
TD["[0] ThreadData<br/>创建线程目录"] --> UL["[1] Uploads<br/>扫描上传文件"] --> SB["[2] Sandbox<br/>获取沙箱"] --> LD_BA["[12] LoopDetection<br/>清理 stale warning"]
|
TD["[0] ThreadData<br/>创建线程目录"] --> UL["[1] Uploads<br/>扫描上传文件"] --> SB["[2] Sandbox<br/>获取沙箱"]
|
||||||
end
|
end
|
||||||
|
|
||||||
subgraph BM ["<b>before_model</b> 正序 0→N"]
|
subgraph BM ["<b>before_model</b> 正序 0→N"]
|
||||||
@@ -43,42 +43,34 @@ graph TB
|
|||||||
VI["[10] ViewImage<br/>注入图片 base64"]
|
VI["[10] ViewImage<br/>注入图片 base64"]
|
||||||
end
|
end
|
||||||
|
|
||||||
subgraph WM ["<b>wrap_model_call</b>"]
|
SB --> VI
|
||||||
direction TB
|
VI --> M["<b>MODEL</b>"]
|
||||||
DTC_WM["[3] DanglingToolCall<br/>补悬空 ToolMessage"] --> LD_WM["[12] LoopDetection<br/>注入当前 run warning"]
|
|
||||||
end
|
|
||||||
|
|
||||||
LD_BA --> VI
|
|
||||||
VI --> DTC_WM
|
|
||||||
LD_WM --> M["<b>MODEL</b>"]
|
|
||||||
|
|
||||||
subgraph AM ["<b>after_model</b> 反序 N→0"]
|
subgraph AM ["<b>after_model</b> 反序 N→0"]
|
||||||
direction TB
|
direction TB
|
||||||
LD["[12] LoopDetection<br/>检测循环/排队 warning"] --> SL["[11] SubagentLimit<br/>截断多余 task"] --> TI["[8] Title<br/>生成标题"]
|
CL["[13] Clarification<br/>拦截 ask_clarification"] --> LD["[12] LoopDetection<br/>检测循环"] --> SL["[11] SubagentLimit<br/>截断多余 task"] --> TI["[8] Title<br/>生成标题"] --> SM["[6] Summarization<br/>上下文压缩"] --> DTC["[3] DanglingToolCall<br/>补缺失 ToolMessage"]
|
||||||
end
|
end
|
||||||
|
|
||||||
M --> LD
|
M --> CL
|
||||||
|
|
||||||
subgraph AA ["<b>after_agent</b> 反序 N→0"]
|
subgraph AA ["<b>after_agent</b> 反序 N→0"]
|
||||||
direction TB
|
direction TB
|
||||||
LD_CLEAN["[12] LoopDetection<br/>清理 pending warning"] --> MEM["[9] Memory<br/>入队记忆"] --> SBR["[2] Sandbox<br/>释放沙箱"]
|
SBR["[2] Sandbox<br/>释放沙箱"] --> MEM["[9] Memory<br/>入队记忆"]
|
||||||
end
|
end
|
||||||
|
|
||||||
TI --> LD_CLEAN
|
DTC --> SBR
|
||||||
SBR --> END(["response"])
|
MEM --> END(["response"])
|
||||||
|
|
||||||
classDef beforeNode fill:#a0a8b5,stroke:#636b7a,color:#2d3239
|
classDef beforeNode fill:#a0a8b5,stroke:#636b7a,color:#2d3239
|
||||||
classDef modelNode fill:#b5a8a0,stroke:#7a6b63,color:#2d3239
|
classDef modelNode fill:#b5a8a0,stroke:#7a6b63,color:#2d3239
|
||||||
classDef wrapModelNode fill:#a8a0b5,stroke:#6b637a,color:#2d3239
|
|
||||||
classDef afterModelNode fill:#b5a0a8,stroke:#7a636b,color:#2d3239
|
classDef afterModelNode fill:#b5a0a8,stroke:#7a636b,color:#2d3239
|
||||||
classDef afterAgentNode fill:#a0b5a8,stroke:#637a6b,color:#2d3239
|
classDef afterAgentNode fill:#a0b5a8,stroke:#637a6b,color:#2d3239
|
||||||
classDef terminalNode fill:#a8b5a0,stroke:#6b7a63,color:#2d3239
|
classDef terminalNode fill:#a8b5a0,stroke:#6b7a63,color:#2d3239
|
||||||
|
|
||||||
class TD,UL,SB,LD_BA,VI beforeNode
|
class TD,UL,SB,VI beforeNode
|
||||||
class DTC_WM,LD_WM wrapModelNode
|
|
||||||
class M modelNode
|
class M modelNode
|
||||||
class LD,SL,TI afterModelNode
|
class CL,LD,SL,TI,SM,DTC afterModelNode
|
||||||
class LD_CLEAN,SBR,MEM afterAgentNode
|
class SBR,MEM afterAgentNode
|
||||||
class START,END terminalNode
|
class START,END terminalNode
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -90,12 +82,13 @@ sequenceDiagram
|
|||||||
participant TD as ThreadDataMiddleware
|
participant TD as ThreadDataMiddleware
|
||||||
participant UL as UploadsMiddleware
|
participant UL as UploadsMiddleware
|
||||||
participant SB as SandboxMiddleware
|
participant SB as SandboxMiddleware
|
||||||
participant LD as LoopDetectionMiddleware
|
|
||||||
participant VI as ViewImageMiddleware
|
participant VI as ViewImageMiddleware
|
||||||
participant DTC as DanglingToolCallMiddleware
|
|
||||||
participant M as MODEL
|
participant M as MODEL
|
||||||
|
participant CL as ClarificationMiddleware
|
||||||
participant SL as SubagentLimitMiddleware
|
participant SL as SubagentLimitMiddleware
|
||||||
participant TI as TitleMiddleware
|
participant TI as TitleMiddleware
|
||||||
|
participant SM as SummarizationMiddleware
|
||||||
|
participant DTC as DanglingToolCallMiddleware
|
||||||
participant MEM as MemoryMiddleware
|
participant MEM as MemoryMiddleware
|
||||||
|
|
||||||
U ->> TD: invoke
|
U ->> TD: invoke
|
||||||
@@ -110,26 +103,19 @@ sequenceDiagram
|
|||||||
activate SB
|
activate SB
|
||||||
Note right of SB: before_agent 获取沙箱
|
Note right of SB: before_agent 获取沙箱
|
||||||
|
|
||||||
SB ->> LD: before_agent
|
SB ->> VI: before_model
|
||||||
activate LD
|
|
||||||
Note right of LD: before_agent 清理同 thread 旧 run 的 pending warning
|
|
||||||
LD ->> VI: before_model
|
|
||||||
activate VI
|
activate VI
|
||||||
Note right of VI: before_model 注入图片 base64
|
Note right of VI: before_model 注入图片 base64
|
||||||
|
|
||||||
VI ->> DTC: wrap_model_call
|
VI ->> M: messages + tools
|
||||||
activate DTC
|
|
||||||
Note right of DTC: wrap_model_call 补悬空 ToolMessage
|
|
||||||
DTC ->> LD: wrap_model_call
|
|
||||||
Note right of LD: wrap_model_call drain 当前 run warning 并追加到末尾
|
|
||||||
LD ->> M: messages + tools
|
|
||||||
activate M
|
activate M
|
||||||
M -->> LD: AI response
|
M -->> CL: AI response
|
||||||
deactivate M
|
deactivate M
|
||||||
|
|
||||||
Note right of LD: after_model 检测循环;warning 入队,hard-stop 清 tool_calls
|
activate CL
|
||||||
LD -->> SL: after_model
|
Note right of CL: after_model 拦截 ask_clarification
|
||||||
deactivate LD
|
CL -->> SL: after_model
|
||||||
|
deactivate CL
|
||||||
|
|
||||||
activate SL
|
activate SL
|
||||||
Note right of SL: after_model 截断多余 task
|
Note right of SL: after_model 截断多余 task
|
||||||
@@ -138,18 +124,22 @@ sequenceDiagram
|
|||||||
|
|
||||||
activate TI
|
activate TI
|
||||||
Note right of TI: after_model 生成标题
|
Note right of TI: after_model 生成标题
|
||||||
TI -->> DTC: done
|
TI -->> SM: after_model
|
||||||
deactivate TI
|
deactivate TI
|
||||||
|
|
||||||
|
activate SM
|
||||||
|
Note right of SM: after_model 上下文压缩
|
||||||
|
SM -->> DTC: after_model
|
||||||
|
deactivate SM
|
||||||
|
|
||||||
|
activate DTC
|
||||||
|
Note right of DTC: after_model 补缺失 ToolMessage
|
||||||
|
DTC -->> VI: done
|
||||||
deactivate DTC
|
deactivate DTC
|
||||||
|
|
||||||
VI -->> SB: done
|
VI -->> SB: done
|
||||||
deactivate VI
|
deactivate VI
|
||||||
|
|
||||||
Note right of LD: after_agent 清理当前 run 未消费 warning
|
|
||||||
|
|
||||||
Note right of MEM: after_agent 入队记忆
|
|
||||||
|
|
||||||
Note right of SB: after_agent 释放沙箱
|
Note right of SB: after_agent 释放沙箱
|
||||||
SB -->> UL: done
|
SB -->> UL: done
|
||||||
deactivate SB
|
deactivate SB
|
||||||
@@ -157,6 +147,8 @@ sequenceDiagram
|
|||||||
UL -->> TD: done
|
UL -->> TD: done
|
||||||
deactivate UL
|
deactivate UL
|
||||||
|
|
||||||
|
Note right of MEM: after_agent 入队记忆
|
||||||
|
|
||||||
TD -->> U: response
|
TD -->> U: response
|
||||||
deactivate TD
|
deactivate TD
|
||||||
```
|
```
|
||||||
@@ -232,12 +224,12 @@ sequenceDiagram
|
|||||||
participant TD as ThreadData
|
participant TD as ThreadData
|
||||||
participant UL as Uploads
|
participant UL as Uploads
|
||||||
participant SB as Sandbox
|
participant SB as Sandbox
|
||||||
participant LD as LoopDetection
|
|
||||||
participant VI as ViewImage
|
participant VI as ViewImage
|
||||||
participant DTC as DanglingToolCall
|
|
||||||
participant M as MODEL
|
participant M as MODEL
|
||||||
|
participant CL as Clarification
|
||||||
participant SL as SubagentLimit
|
participant SL as SubagentLimit
|
||||||
participant TI as Title
|
participant TI as Title
|
||||||
|
participant SM as Summarization
|
||||||
participant MEM as Memory
|
participant MEM as Memory
|
||||||
|
|
||||||
U ->> TD: invoke
|
U ->> TD: invoke
|
||||||
@@ -246,40 +238,34 @@ sequenceDiagram
|
|||||||
Note right of UL: before_agent 扫描文件
|
Note right of UL: before_agent 扫描文件
|
||||||
UL ->> SB: .
|
UL ->> SB: .
|
||||||
Note right of SB: before_agent 获取沙箱
|
Note right of SB: before_agent 获取沙箱
|
||||||
SB ->> LD: .
|
|
||||||
Note right of LD: before_agent 清理 stale pending warning
|
|
||||||
|
|
||||||
loop 每轮对话(tool call 循环)
|
loop 每轮对话(tool call 循环)
|
||||||
SB ->> VI: .
|
SB ->> VI: .
|
||||||
Note right of VI: before_model 注入图片
|
Note right of VI: before_model 注入图片
|
||||||
VI ->> DTC: .
|
VI ->> M: messages + tools
|
||||||
Note right of DTC: wrap_model_call 补悬空工具结果
|
M -->> CL: AI response
|
||||||
DTC ->> LD: .
|
Note right of CL: after_model 拦截 ask_clarification
|
||||||
Note right of LD: wrap_model_call 注入当前 run warning
|
CL -->> SL: .
|
||||||
LD ->> M: messages + tools
|
|
||||||
M -->> LD: AI response
|
|
||||||
Note right of LD: after_model 检测循环/排队 warning
|
|
||||||
LD -->> SL: .
|
|
||||||
Note right of SL: after_model 截断多余 task
|
Note right of SL: after_model 截断多余 task
|
||||||
SL -->> TI: .
|
SL -->> TI: .
|
||||||
Note right of TI: after_model 生成标题
|
Note right of TI: after_model 生成标题
|
||||||
|
TI -->> SM: .
|
||||||
|
Note right of SM: after_model 上下文压缩
|
||||||
end
|
end
|
||||||
|
|
||||||
Note right of LD: after_agent 清理当前 run pending warning
|
|
||||||
LD -->> MEM: .
|
|
||||||
Note right of MEM: after_agent 入队记忆
|
|
||||||
MEM -->> SB: .
|
|
||||||
Note right of SB: after_agent 释放沙箱
|
Note right of SB: after_agent 释放沙箱
|
||||||
SB -->> U: response
|
SB -->> MEM: .
|
||||||
|
Note right of MEM: after_agent 入队记忆
|
||||||
|
MEM -->> U: response
|
||||||
```
|
```
|
||||||
|
|
||||||
> [!warning] 不是洋葱
|
> [!warning] 不是洋葱
|
||||||
> 大部分 middleware 只用一个阶段。SandboxMiddleware 使用 `before_agent`/`after_agent` 做资源获取/释放;LoopDetectionMiddleware 也使用这两个钩子,但用途是清理 run-scoped pending warnings,不是资源生命周期对称。`before_agent` / `after_agent` 只跑一次,`before_model` / `after_model` / `wrap_model_call` 每轮循环都跑。
|
> 14 个 middleware 中只有 SandboxMiddleware 有 before/after 对称(获取/释放)。其余都是单向的:要么只在 `before_*` 做事,要么只在 `after_*` 做事。`before_agent` / `after_agent` 只跑一次,`before_model` / `after_model` 每轮循环都跑。
|
||||||
|
|
||||||
硬依赖只有 2 处:
|
硬依赖只有 2 处:
|
||||||
|
|
||||||
1. **ThreadData 在 Sandbox 之前** — sandbox 需要线程目录
|
1. **ThreadData 在 Sandbox 之前** — sandbox 需要线程目录
|
||||||
2. **Clarification 在列表最后** — `wrap_tool_call` 处理 `ask_clarification` 时优先拦截,并通过 `Command(goto=END)` 中断执行
|
2. **Clarification 在列表最后** — `after_model` 反序时最先执行,第一个拦截 `ask_clarification`
|
||||||
|
|
||||||
### 结论
|
### 结论
|
||||||
|
|
||||||
@@ -287,19 +273,19 @@ sequenceDiagram
|
|||||||
|---|---|---|
|
|---|---|---|
|
||||||
| 每个 middleware | before + after 对称 | 大多只用一个钩子 |
|
| 每个 middleware | before + after 对称 | 大多只用一个钩子 |
|
||||||
| 激活条 | 嵌套(外长内短) | 不嵌套(串行) |
|
| 激活条 | 嵌套(外长内短) | 不嵌套(串行) |
|
||||||
| 反序的意义 | 清理与初始化配对 | 影响 `after_model` / `after_agent` 的执行优先级 |
|
| 反序的意义 | 清理与初始化配对 | 仅影响 after_model 的执行优先级 |
|
||||||
| 典型例子 | Auth: 校验 token / 清理上下文 | ThreadData: 只创建目录,没有清理 |
|
| 典型例子 | Auth: 校验 token / 清理上下文 | ThreadData: 只创建目录,没有清理 |
|
||||||
|
|
||||||
## 关键设计点
|
## 关键设计点
|
||||||
|
|
||||||
### ClarificationMiddleware 为什么在列表最后?
|
### ClarificationMiddleware 为什么在列表最后?
|
||||||
|
|
||||||
位置最后使它在工具调用包装链中优先拦截 `ask_clarification`。如果命中,它返回 `Command(goto=END)`,把格式化后的澄清问题写成 `ToolMessage` 并中断执行。
|
位置最后 = `after_model` 最先执行。它需要**第一个**看到 model 输出,检查是否有 `ask_clarification` tool call。如果有,立即中断(`Command(goto=END)`),后续 middleware 的 `after_model` 不再执行。
|
||||||
|
|
||||||
### SandboxMiddleware 的对称性
|
### SandboxMiddleware 的对称性
|
||||||
|
|
||||||
`before_agent`(正序第 3 个)获取沙箱,`after_agent`(反序第 1 个)释放沙箱。外层进入 → 外层退出,天然的洋葱对称。
|
`before_agent`(正序第 3 个)获取沙箱,`after_agent`(反序第 1 个)释放沙箱。外层进入 → 外层退出,天然的洋葱对称。
|
||||||
|
|
||||||
### LoopDetectionMiddleware 为什么同时用多个钩子?
|
### 大部分 middleware 只用一个钩子
|
||||||
|
|
||||||
`after_model` 只做检测:重复工具调用达到 warning 阈值时,把 warning 放入 `(thread_id, run_id)` 作用域的 pending 队列。真正注入发生在下一次 `wrap_model_call`:此时上一轮 `AIMessage(tool_calls)` 对应的 `ToolMessage` 已经在请求里,warning 追加在末尾,不会破坏 OpenAI/Moonshot 的 tool-call pairing。`before_agent` 清理同一 thread 下旧 run 的残留 warning,`after_agent` 清理当前 run 没被消费的 warning。
|
14 个 middleware 中,只有 SandboxMiddleware 同时用了 `before_agent` + `after_agent`(获取/释放)。其余都只在一个阶段执行。洋葱模型的反序特性主要影响 `after_model` 阶段的执行顺序。
|
||||||
|
|||||||
@@ -1,23 +1,3 @@
|
|||||||
"""Lead agent factory.
|
|
||||||
|
|
||||||
INVARIANT — tracing callback placement
|
|
||||||
======================================
|
|
||||||
|
|
||||||
Tracing callbacks (Langfuse, LangSmith) are attached at the **graph
|
|
||||||
invocation root** in :func:`_make_lead_agent` (see the
|
|
||||||
``build_tracing_callbacks()`` block that appends to ``config["callbacks"]``).
|
|
||||||
Every ``create_chat_model(...)`` call inside this module — and inside any
|
|
||||||
middleware reachable from this graph (e.g. ``TitleMiddleware``) — MUST pass
|
|
||||||
``attach_tracing=False``.
|
|
||||||
|
|
||||||
Forgetting that flag emits duplicate spans (one rooted at the graph, one at
|
|
||||||
the model) AND prevents the Langfuse handler's ``propagate_attributes``
|
|
||||||
path from firing, so ``session_id`` / ``user_id`` never reach the trace.
|
|
||||||
The four current sites are: bootstrap agent, default agent, summarization
|
|
||||||
middleware, and the async path inside ``TitleMiddleware``. Any new in-graph
|
|
||||||
``create_chat_model`` call must add to this list and pass the flag.
|
|
||||||
"""
|
|
||||||
|
|
||||||
import logging
|
import logging
|
||||||
|
|
||||||
from langchain.agents import create_agent
|
from langchain.agents import create_agent
|
||||||
@@ -29,7 +9,6 @@ from deerflow.agents.memory.summarization_hook import memory_flush_hook
|
|||||||
from deerflow.agents.middlewares.clarification_middleware import ClarificationMiddleware
|
from deerflow.agents.middlewares.clarification_middleware import ClarificationMiddleware
|
||||||
from deerflow.agents.middlewares.loop_detection_middleware import LoopDetectionMiddleware
|
from deerflow.agents.middlewares.loop_detection_middleware import LoopDetectionMiddleware
|
||||||
from deerflow.agents.middlewares.memory_middleware import MemoryMiddleware
|
from deerflow.agents.middlewares.memory_middleware import MemoryMiddleware
|
||||||
from deerflow.agents.middlewares.safety_finish_reason_middleware import SafetyFinishReasonMiddleware
|
|
||||||
from deerflow.agents.middlewares.subagent_limit_middleware import SubagentLimitMiddleware
|
from deerflow.agents.middlewares.subagent_limit_middleware import SubagentLimitMiddleware
|
||||||
from deerflow.agents.middlewares.summarization_middleware import BeforeSummarizationHook, DeerFlowSummarizationMiddleware
|
from deerflow.agents.middlewares.summarization_middleware import BeforeSummarizationHook, DeerFlowSummarizationMiddleware
|
||||||
from deerflow.agents.middlewares.title_middleware import TitleMiddleware
|
from deerflow.agents.middlewares.title_middleware import TitleMiddleware
|
||||||
@@ -43,7 +22,6 @@ from deerflow.config.app_config import AppConfig, get_app_config
|
|||||||
from deerflow.models import create_chat_model
|
from deerflow.models import create_chat_model
|
||||||
from deerflow.skills.tool_policy import filter_tools_by_skill_allowed_tools
|
from deerflow.skills.tool_policy import filter_tools_by_skill_allowed_tools
|
||||||
from deerflow.skills.types import Skill
|
from deerflow.skills.types import Skill
|
||||||
from deerflow.tracing import build_tracing_callbacks
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
@@ -95,14 +73,10 @@ def _create_summarization_middleware(*, app_config: AppConfig | None = None) ->
|
|||||||
# Bind "middleware:summarize" tag so RunJournal identifies these LLM calls
|
# Bind "middleware:summarize" tag so RunJournal identifies these LLM calls
|
||||||
# as middleware rather than lead_agent (SummarizationMiddleware is a
|
# as middleware rather than lead_agent (SummarizationMiddleware is a
|
||||||
# LangChain built-in, so we tag the model at creation time).
|
# LangChain built-in, so we tag the model at creation time).
|
||||||
# attach_tracing=False because the graph-level RunnableConfig (set in
|
|
||||||
# ``_make_lead_agent``) already carries tracing callbacks; binding them
|
|
||||||
# again at the model level would emit duplicate spans and break
|
|
||||||
# ``session_id`` / ``user_id`` propagation.
|
|
||||||
if config.model_name:
|
if config.model_name:
|
||||||
model = create_chat_model(name=config.model_name, thinking_enabled=False, app_config=resolved_app_config, attach_tracing=False)
|
model = create_chat_model(name=config.model_name, thinking_enabled=False, app_config=resolved_app_config)
|
||||||
else:
|
else:
|
||||||
model = create_chat_model(thinking_enabled=False, app_config=resolved_app_config, attach_tracing=False)
|
model = create_chat_model(thinking_enabled=False, app_config=resolved_app_config)
|
||||||
model = model.with_config(tags=["middleware:summarize"])
|
model = model.with_config(tags=["middleware:summarize"])
|
||||||
|
|
||||||
# Prepare kwargs
|
# Prepare kwargs
|
||||||
@@ -339,15 +313,6 @@ def _build_middlewares(
|
|||||||
if custom_middlewares:
|
if custom_middlewares:
|
||||||
middlewares.extend(custom_middlewares)
|
middlewares.extend(custom_middlewares)
|
||||||
|
|
||||||
# SafetyFinishReasonMiddleware — suppress tool execution when the provider
|
|
||||||
# safety-terminated the response. Registered after custom middlewares so
|
|
||||||
# that LangChain's reverse-order after_model dispatch runs Safety first;
|
|
||||||
# cleared tool_calls then flow through Loop/Subagent accounting without
|
|
||||||
# firing extra alarms. See safety_finish_reason_middleware.py docstring.
|
|
||||||
safety_config = resolved_app_config.safety_finish_reason
|
|
||||||
if safety_config.enabled:
|
|
||||||
middlewares.append(SafetyFinishReasonMiddleware.from_config(safety_config))
|
|
||||||
|
|
||||||
# ClarificationMiddleware should always be last
|
# ClarificationMiddleware should always be last
|
||||||
middlewares.append(ClarificationMiddleware())
|
middlewares.append(ClarificationMiddleware())
|
||||||
return middlewares
|
return middlewares
|
||||||
@@ -443,26 +408,13 @@ def _make_lead_agent(config: RunnableConfig, *, app_config: AppConfig):
|
|||||||
}
|
}
|
||||||
)
|
)
|
||||||
|
|
||||||
# Inject tracing callbacks at the graph invocation root so a single LangGraph
|
|
||||||
# run produces one trace with all node / LLM / tool calls as child spans,
|
|
||||||
# AND so the Langfuse handler sees ``on_chain_start(parent_run_id=None)`` and
|
|
||||||
# actually propagates ``langfuse_session_id`` / ``langfuse_user_id`` from
|
|
||||||
# ``config["metadata"]`` onto the trace. Without root-level attachment the
|
|
||||||
# model is a nested observation and the handler strips ``langfuse_*`` keys.
|
|
||||||
tracing_callbacks = build_tracing_callbacks()
|
|
||||||
if tracing_callbacks:
|
|
||||||
existing = config.get("callbacks") or []
|
|
||||||
if not isinstance(existing, list):
|
|
||||||
existing = list(existing)
|
|
||||||
config["callbacks"] = [*existing, *tracing_callbacks]
|
|
||||||
|
|
||||||
skills_for_tool_policy = _load_enabled_skills_for_tool_policy(available_skills, app_config=resolved_app_config)
|
skills_for_tool_policy = _load_enabled_skills_for_tool_policy(available_skills, app_config=resolved_app_config)
|
||||||
|
|
||||||
if is_bootstrap:
|
if is_bootstrap:
|
||||||
# Special bootstrap agent with minimal prompt for initial custom agent creation flow
|
# Special bootstrap agent with minimal prompt for initial custom agent creation flow
|
||||||
tools = get_available_tools(model_name=model_name, subagent_enabled=subagent_enabled, app_config=resolved_app_config) + [setup_agent]
|
tools = get_available_tools(model_name=model_name, subagent_enabled=subagent_enabled, app_config=resolved_app_config) + [setup_agent]
|
||||||
return create_agent(
|
return create_agent(
|
||||||
model=create_chat_model(name=model_name, thinking_enabled=thinking_enabled, app_config=resolved_app_config, attach_tracing=False),
|
model=create_chat_model(name=model_name, thinking_enabled=thinking_enabled, app_config=resolved_app_config),
|
||||||
tools=filter_tools_by_skill_allowed_tools(tools, skills_for_tool_policy),
|
tools=filter_tools_by_skill_allowed_tools(tools, skills_for_tool_policy),
|
||||||
middleware=_build_middlewares(config, model_name=model_name, app_config=resolved_app_config),
|
middleware=_build_middlewares(config, model_name=model_name, app_config=resolved_app_config),
|
||||||
system_prompt=apply_prompt_template(
|
system_prompt=apply_prompt_template(
|
||||||
@@ -480,7 +432,7 @@ def _make_lead_agent(config: RunnableConfig, *, app_config: AppConfig):
|
|||||||
# Default lead agent (unchanged behavior)
|
# Default lead agent (unchanged behavior)
|
||||||
tools = get_available_tools(model_name=model_name, groups=agent_config.tool_groups if agent_config else None, subagent_enabled=subagent_enabled, app_config=resolved_app_config)
|
tools = get_available_tools(model_name=model_name, groups=agent_config.tool_groups if agent_config else None, subagent_enabled=subagent_enabled, app_config=resolved_app_config)
|
||||||
return create_agent(
|
return create_agent(
|
||||||
model=create_chat_model(name=model_name, thinking_enabled=thinking_enabled, reasoning_effort=reasoning_effort, app_config=resolved_app_config, attach_tracing=False),
|
model=create_chat_model(name=model_name, thinking_enabled=thinking_enabled, reasoning_effort=reasoning_effort, app_config=resolved_app_config),
|
||||||
tools=filter_tools_by_skill_allowed_tools(tools + extra_tools, skills_for_tool_policy),
|
tools=filter_tools_by_skill_allowed_tools(tools + extra_tools, skills_for_tool_policy),
|
||||||
middleware=_build_middlewares(config, model_name=model_name, agent_name=agent_name, app_config=resolved_app_config),
|
middleware=_build_middlewares(config, model_name=model_name, agent_name=agent_name, app_config=resolved_app_config),
|
||||||
system_prompt=apply_prompt_template(
|
system_prompt=apply_prompt_template(
|
||||||
|
|||||||
@@ -227,110 +227,6 @@ def _extract_text(content: Any) -> str:
|
|||||||
return str(content)
|
return str(content)
|
||||||
|
|
||||||
|
|
||||||
_REQUIRED_MEMORY_UPDATE_TOP_LEVEL_KEYS = frozenset({"user", "history", "newFacts", "factsToRemove"})
|
|
||||||
|
|
||||||
|
|
||||||
def _normalize_memory_update_fact(fact: Any) -> dict[str, Any] | None:
|
|
||||||
"""Normalize a single fact entry from a model-produced memory update."""
|
|
||||||
if not isinstance(fact, dict):
|
|
||||||
return None
|
|
||||||
|
|
||||||
raw_content = fact.get("content")
|
|
||||||
if not isinstance(raw_content, str):
|
|
||||||
return None
|
|
||||||
content = raw_content.strip()
|
|
||||||
if not content:
|
|
||||||
return None
|
|
||||||
|
|
||||||
raw_category = fact.get("category")
|
|
||||||
category = raw_category.strip() if isinstance(raw_category, str) and raw_category.strip() else "context"
|
|
||||||
|
|
||||||
raw_confidence = fact.get("confidence", 0.5)
|
|
||||||
if isinstance(raw_confidence, bool):
|
|
||||||
return None
|
|
||||||
if isinstance(raw_confidence, str):
|
|
||||||
raw_confidence = raw_confidence.strip()
|
|
||||||
if not raw_confidence:
|
|
||||||
return None
|
|
||||||
try:
|
|
||||||
raw_confidence = float(raw_confidence)
|
|
||||||
except ValueError:
|
|
||||||
return None
|
|
||||||
elif isinstance(raw_confidence, (int, float)):
|
|
||||||
raw_confidence = float(raw_confidence)
|
|
||||||
else:
|
|
||||||
return None
|
|
||||||
|
|
||||||
if not math.isfinite(raw_confidence):
|
|
||||||
return None
|
|
||||||
|
|
||||||
normalized_fact = {
|
|
||||||
"content": content,
|
|
||||||
"category": category,
|
|
||||||
"confidence": raw_confidence,
|
|
||||||
}
|
|
||||||
source_error = fact.get("sourceError")
|
|
||||||
if isinstance(source_error, str):
|
|
||||||
normalized_source_error = source_error.strip()
|
|
||||||
if normalized_source_error:
|
|
||||||
normalized_fact["sourceError"] = normalized_source_error
|
|
||||||
|
|
||||||
return normalized_fact
|
|
||||||
|
|
||||||
|
|
||||||
def _normalize_memory_update_data(update_data: dict[str, Any]) -> dict[str, Any]:
|
|
||||||
"""Coerce parsed memory update data into the shape consumed by _apply_updates."""
|
|
||||||
user = update_data.get("user")
|
|
||||||
history = update_data.get("history")
|
|
||||||
new_facts = update_data.get("newFacts")
|
|
||||||
facts_to_remove = update_data.get("factsToRemove")
|
|
||||||
normalized_facts_to_remove = [fact_id for fact_id in facts_to_remove if isinstance(fact_id, str)] if isinstance(facts_to_remove, list) else []
|
|
||||||
normalized_new_facts = []
|
|
||||||
dropped_new_fact = not isinstance(new_facts, list)
|
|
||||||
if isinstance(new_facts, list):
|
|
||||||
for fact in new_facts:
|
|
||||||
normalized_fact = _normalize_memory_update_fact(fact)
|
|
||||||
if normalized_fact is not None:
|
|
||||||
normalized_new_facts.append(normalized_fact)
|
|
||||||
else:
|
|
||||||
dropped_new_fact = True
|
|
||||||
|
|
||||||
if normalized_facts_to_remove and dropped_new_fact:
|
|
||||||
raise json.JSONDecodeError(
|
|
||||||
"Unsafe partial memory update: factsToRemove with malformed newFacts",
|
|
||||||
json.dumps(update_data, ensure_ascii=False),
|
|
||||||
0,
|
|
||||||
)
|
|
||||||
|
|
||||||
return {
|
|
||||||
"user": user if isinstance(user, dict) else {},
|
|
||||||
"history": history if isinstance(history, dict) else {},
|
|
||||||
"newFacts": normalized_new_facts,
|
|
||||||
"factsToRemove": normalized_facts_to_remove,
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
def _parse_memory_update_response(response_content: Any) -> dict[str, Any]:
|
|
||||||
"""Parse the first valid memory-update JSON object from an LLM response.
|
|
||||||
|
|
||||||
Some providers may wrap JSON in thinking traces, prose, or markdown fences
|
|
||||||
even when prompted to return JSON only. This parser accepts safely
|
|
||||||
extractable JSON objects but does not repair truncated or malformed JSON.
|
|
||||||
"""
|
|
||||||
response_text = _extract_text(response_content).strip()
|
|
||||||
decoder = json.JSONDecoder()
|
|
||||||
|
|
||||||
for match in re.finditer(r"\{", response_text):
|
|
||||||
try:
|
|
||||||
parsed, _end = decoder.raw_decode(response_text[match.start() :])
|
|
||||||
except json.JSONDecodeError:
|
|
||||||
continue
|
|
||||||
if isinstance(parsed, dict) and _REQUIRED_MEMORY_UPDATE_TOP_LEVEL_KEYS.issubset(parsed):
|
|
||||||
return _normalize_memory_update_data(parsed)
|
|
||||||
|
|
||||||
raise json.JSONDecodeError("No valid memory update JSON object found", response_text, 0)
|
|
||||||
|
|
||||||
|
|
||||||
# Matches sentences that describe a file-upload *event* rather than general
|
# Matches sentences that describe a file-upload *event* rather than general
|
||||||
# file-related work. Deliberately narrow to avoid removing legitimate facts
|
# file-related work. Deliberately narrow to avoid removing legitimate facts
|
||||||
# such as "User works with CSV files" or "prefers PDF export".
|
# such as "User works with CSV files" or "prefers PDF export".
|
||||||
@@ -442,7 +338,7 @@ class MemoryUpdater:
|
|||||||
reinforcement_detected=reinforcement_detected,
|
reinforcement_detected=reinforcement_detected,
|
||||||
)
|
)
|
||||||
prompt = MEMORY_UPDATE_PROMPT.format(
|
prompt = MEMORY_UPDATE_PROMPT.format(
|
||||||
current_memory=json.dumps(current_memory, indent=2, ensure_ascii=False),
|
current_memory=json.dumps(current_memory, indent=2),
|
||||||
conversation=conversation_text,
|
conversation=conversation_text,
|
||||||
correction_hint=correction_hint,
|
correction_hint=correction_hint,
|
||||||
)
|
)
|
||||||
@@ -457,7 +353,13 @@ class MemoryUpdater:
|
|||||||
user_id: str | None = None,
|
user_id: str | None = None,
|
||||||
) -> bool:
|
) -> bool:
|
||||||
"""Parse the model response, apply updates, and persist memory."""
|
"""Parse the model response, apply updates, and persist memory."""
|
||||||
update_data = _parse_memory_update_response(response_content)
|
response_text = _extract_text(response_content).strip()
|
||||||
|
|
||||||
|
if response_text.startswith("```"):
|
||||||
|
lines = response_text.split("\n")
|
||||||
|
response_text = "\n".join(lines[1:-1] if lines[-1] == "```" else lines[1:])
|
||||||
|
|
||||||
|
update_data = json.loads(response_text)
|
||||||
# Deep-copy before in-place mutation so a subsequent save() failure
|
# Deep-copy before in-place mutation so a subsequent save() failure
|
||||||
# cannot corrupt the still-cached original object reference.
|
# cannot corrupt the still-cached original object reference.
|
||||||
updated_memory = self._apply_updates(copy.deepcopy(current_memory), update_data, thread_id)
|
updated_memory = self._apply_updates(copy.deepcopy(current_memory), update_data, thread_id)
|
||||||
|
|||||||
+24
-49
@@ -15,7 +15,6 @@ to the end of the message list as before_model + add_messages reducer would do.
|
|||||||
|
|
||||||
import json
|
import json
|
||||||
import logging
|
import logging
|
||||||
from collections import defaultdict, deque
|
|
||||||
from collections.abc import Awaitable, Callable
|
from collections.abc import Awaitable, Callable
|
||||||
from typing import override
|
from typing import override
|
||||||
|
|
||||||
@@ -26,11 +25,6 @@ from langchain_core.messages import ToolMessage
|
|||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# Workaround for issue #2894: malformed write_file calls can carry huge Markdown
|
|
||||||
# payloads in invalid tool-call args. Keep recovery error details short so the
|
|
||||||
# synthetic ToolMessage does not echo large or malformed content back to the model.
|
|
||||||
_MAX_RECOVERY_ERROR_DETAIL_LEN = 500
|
|
||||||
|
|
||||||
|
|
||||||
class DanglingToolCallMiddleware(AgentMiddleware[AgentState]):
|
class DanglingToolCallMiddleware(AgentMiddleware[AgentState]):
|
||||||
"""Inserts placeholder ToolMessages for dangling tool calls before model invocation.
|
"""Inserts placeholder ToolMessages for dangling tool calls before model invocation.
|
||||||
@@ -103,68 +97,52 @@ class DanglingToolCallMiddleware(AgentMiddleware[AgentState]):
|
|||||||
@staticmethod
|
@staticmethod
|
||||||
def _synthetic_tool_message_content(tool_call: dict) -> str:
|
def _synthetic_tool_message_content(tool_call: dict) -> str:
|
||||||
if tool_call.get("invalid"):
|
if tool_call.get("invalid"):
|
||||||
name = tool_call.get("name")
|
|
||||||
error = tool_call.get("error")
|
error = tool_call.get("error")
|
||||||
error_text = error[:_MAX_RECOVERY_ERROR_DETAIL_LEN] if isinstance(error, str) and error else ""
|
if isinstance(error, str) and error:
|
||||||
# Workaround for issue #2894: malformed write_file calls can carry huge Markdown
|
return f"[Tool call could not be executed because its arguments were invalid: {error}]"
|
||||||
# payloads in invalid tool-call args. Keep recovery guidance actionable without
|
|
||||||
# echoing large or malformed content back to the model.
|
|
||||||
if name == "write_file":
|
|
||||||
details = f" Parser error: {error_text}" if error_text else ""
|
|
||||||
return (
|
|
||||||
"[write_file failed before execution: the tool-call arguments were not valid JSON, "
|
|
||||||
"so no file was written. This often happens when the model tries to write a very "
|
|
||||||
"large Markdown file in a single tool call, especially when `content` contains "
|
|
||||||
"unescaped quotes, inline JSON, backslashes, or code fences. Do not retry the same "
|
|
||||||
"large `write_file` payload for this artifact; provide the report/content directly "
|
|
||||||
"as normal assistant text in your next response. If a file write is still needed "
|
|
||||||
f"later, split the file into smaller sections instead of one large payload.{details}]"
|
|
||||||
)
|
|
||||||
if error_text:
|
|
||||||
return f"[Tool call could not be executed because its arguments were invalid: {error_text}]"
|
|
||||||
return "[Tool call could not be executed because its arguments were invalid.]"
|
return "[Tool call could not be executed because its arguments were invalid.]"
|
||||||
return "[Tool call was interrupted and did not return a result.]"
|
return "[Tool call was interrupted and did not return a result.]"
|
||||||
|
|
||||||
def _build_patched_messages(self, messages: list) -> list | None:
|
def _build_patched_messages(self, messages: list) -> list | None:
|
||||||
"""Return messages with tool results grouped after their tool-call AIMessage.
|
"""Return a new message list with patches inserted at the correct positions.
|
||||||
|
|
||||||
This normalizes model-bound causal order before provider serialization while
|
For each AIMessage with dangling tool_calls (no corresponding ToolMessage),
|
||||||
preserving already-valid transcripts unchanged.
|
a synthetic ToolMessage is inserted immediately after that AIMessage.
|
||||||
|
Returns None if no patches are needed.
|
||||||
"""
|
"""
|
||||||
tool_messages_by_id: dict[str, deque[ToolMessage]] = defaultdict(deque)
|
# Collect IDs of all existing ToolMessages
|
||||||
|
existing_tool_msg_ids: set[str] = set()
|
||||||
for msg in messages:
|
for msg in messages:
|
||||||
if isinstance(msg, ToolMessage):
|
if isinstance(msg, ToolMessage):
|
||||||
tool_messages_by_id[msg.tool_call_id].append(msg)
|
existing_tool_msg_ids.add(msg.tool_call_id)
|
||||||
|
|
||||||
tool_call_ids: set[str] = set()
|
# Check if any patching is needed
|
||||||
|
needs_patch = False
|
||||||
for msg in messages:
|
for msg in messages:
|
||||||
if getattr(msg, "type", None) != "ai":
|
if getattr(msg, "type", None) != "ai":
|
||||||
continue
|
continue
|
||||||
for tc in self._message_tool_calls(msg):
|
for tc in self._message_tool_calls(msg):
|
||||||
tc_id = tc.get("id")
|
tc_id = tc.get("id")
|
||||||
if tc_id:
|
if tc_id and tc_id not in existing_tool_msg_ids:
|
||||||
tool_call_ids.add(tc_id)
|
needs_patch = True
|
||||||
|
break
|
||||||
|
if needs_patch:
|
||||||
|
break
|
||||||
|
|
||||||
|
if not needs_patch:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Build new list with patches inserted right after each dangling AIMessage
|
||||||
patched: list = []
|
patched: list = []
|
||||||
|
patched_ids: set[str] = set()
|
||||||
patch_count = 0
|
patch_count = 0
|
||||||
for msg in messages:
|
for msg in messages:
|
||||||
if isinstance(msg, ToolMessage) and msg.tool_call_id in tool_call_ids:
|
|
||||||
continue
|
|
||||||
|
|
||||||
patched.append(msg)
|
patched.append(msg)
|
||||||
if getattr(msg, "type", None) != "ai":
|
if getattr(msg, "type", None) != "ai":
|
||||||
continue
|
continue
|
||||||
|
|
||||||
for tc in self._message_tool_calls(msg):
|
for tc in self._message_tool_calls(msg):
|
||||||
tc_id = tc.get("id")
|
tc_id = tc.get("id")
|
||||||
if not tc_id:
|
if tc_id and tc_id not in existing_tool_msg_ids and tc_id not in patched_ids:
|
||||||
continue
|
|
||||||
|
|
||||||
tool_msg_queue = tool_messages_by_id.get(tc_id)
|
|
||||||
existing_tool_msg = tool_msg_queue.popleft() if tool_msg_queue else None
|
|
||||||
if existing_tool_msg is not None:
|
|
||||||
patched.append(existing_tool_msg)
|
|
||||||
else:
|
|
||||||
patched.append(
|
patched.append(
|
||||||
ToolMessage(
|
ToolMessage(
|
||||||
content=self._synthetic_tool_message_content(tc),
|
content=self._synthetic_tool_message_content(tc),
|
||||||
@@ -173,13 +151,10 @@ class DanglingToolCallMiddleware(AgentMiddleware[AgentState]):
|
|||||||
status="error",
|
status="error",
|
||||||
)
|
)
|
||||||
)
|
)
|
||||||
|
patched_ids.add(tc_id)
|
||||||
patch_count += 1
|
patch_count += 1
|
||||||
|
|
||||||
if patched == messages:
|
logger.warning(f"Injecting {patch_count} placeholder ToolMessage(s) for dangling tool calls")
|
||||||
return None
|
|
||||||
|
|
||||||
if patch_count:
|
|
||||||
logger.warning(f"Injecting {patch_count} placeholder ToolMessage(s) for dangling tool calls")
|
|
||||||
return patched
|
return patched
|
||||||
|
|
||||||
@override
|
@override
|
||||||
|
|||||||
+28
-201
@@ -6,36 +6,10 @@ arguments indefinitely until the recursion limit kills the run.
|
|||||||
Detection strategy:
|
Detection strategy:
|
||||||
1. After each model response, hash the tool calls (name + args).
|
1. After each model response, hash the tool calls (name + args).
|
||||||
2. Track recent hashes in a sliding window.
|
2. Track recent hashes in a sliding window.
|
||||||
3. If the same hash appears >= warn_threshold times, queue a
|
3. If the same hash appears >= warn_threshold times, inject a
|
||||||
"you are repeating yourself — wrap up" warning for the current
|
"you are repeating yourself — wrap up" system message (once per hash).
|
||||||
thread/run. The warning is **injected at the next model call** (in
|
|
||||||
``wrap_model_call``) as a ``HumanMessage`` appended to the message
|
|
||||||
list, *after* all ToolMessage responses to the previous
|
|
||||||
AIMessage(tool_calls).
|
|
||||||
4. If it appears >= hard_limit times, strip all tool_calls from the
|
4. If it appears >= hard_limit times, strip all tool_calls from the
|
||||||
response so the agent is forced to produce a final text answer.
|
response so the agent is forced to produce a final text answer.
|
||||||
|
|
||||||
Why the warning is injected at ``wrap_model_call`` instead of
|
|
||||||
``after_model``:
|
|
||||||
|
|
||||||
``after_model`` fires immediately after the model emits an
|
|
||||||
``AIMessage`` that may carry ``tool_calls``. The tools node has not
|
|
||||||
run yet, so no matching ``ToolMessage`` exists in the history. Any
|
|
||||||
message we add here lands *between* the assistant's tool_calls and
|
|
||||||
their responses. OpenAI/Moonshot reject the next request with
|
|
||||||
``"tool_call_ids did not have response messages"`` because their
|
|
||||||
validators require the assistant's tool_calls to be followed
|
|
||||||
immediately by tool messages. Anthropic also disallows mid-stream
|
|
||||||
``SystemMessage``. By deferring the warning to ``wrap_model_call``,
|
|
||||||
every prior ToolMessage is already present in the request's message
|
|
||||||
list and the warning is appended at the end — pairing intact, no
|
|
||||||
``AIMessage`` semantics are mutated.
|
|
||||||
|
|
||||||
Queued warnings are intentionally transient. If a run ends before the
|
|
||||||
next model request drains a queued warning, ``after_agent`` drops it
|
|
||||||
instead of carrying it into a later invocation for the same thread. The
|
|
||||||
hard-stop path still forces termination when the configured safety limit
|
|
||||||
is reached.
|
|
||||||
"""
|
"""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
@@ -45,14 +19,11 @@ import json
|
|||||||
import logging
|
import logging
|
||||||
import threading
|
import threading
|
||||||
from collections import OrderedDict, defaultdict
|
from collections import OrderedDict, defaultdict
|
||||||
from collections.abc import Awaitable, Callable
|
|
||||||
from copy import deepcopy
|
from copy import deepcopy
|
||||||
from typing import TYPE_CHECKING, override
|
from typing import TYPE_CHECKING, override
|
||||||
|
|
||||||
from langchain.agents import AgentState
|
from langchain.agents import AgentState
|
||||||
from langchain.agents.middleware import AgentMiddleware
|
from langchain.agents.middleware import AgentMiddleware
|
||||||
from langchain.agents.middleware.types import ModelCallResult, ModelRequest, ModelResponse
|
|
||||||
from langchain_core.messages import HumanMessage
|
|
||||||
from langgraph.runtime import Runtime
|
from langgraph.runtime import Runtime
|
||||||
|
|
||||||
if TYPE_CHECKING:
|
if TYPE_CHECKING:
|
||||||
@@ -67,7 +38,6 @@ _DEFAULT_WINDOW_SIZE = 20 # track last N tool calls
|
|||||||
_DEFAULT_MAX_TRACKED_THREADS = 100 # LRU eviction limit
|
_DEFAULT_MAX_TRACKED_THREADS = 100 # LRU eviction limit
|
||||||
_DEFAULT_TOOL_FREQ_WARN = 30 # warn after 30 calls to the same tool type
|
_DEFAULT_TOOL_FREQ_WARN = 30 # warn after 30 calls to the same tool type
|
||||||
_DEFAULT_TOOL_FREQ_HARD_LIMIT = 50 # force-stop after 50 calls to the same tool type
|
_DEFAULT_TOOL_FREQ_HARD_LIMIT = 50 # force-stop after 50 calls to the same tool type
|
||||||
_MAX_PENDING_WARNINGS_PER_RUN = 4
|
|
||||||
|
|
||||||
|
|
||||||
def _normalize_tool_call_args(raw_args: object) -> tuple[dict, str | None]:
|
def _normalize_tool_call_args(raw_args: object) -> tuple[dict, str | None]:
|
||||||
@@ -225,12 +195,6 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
self._warned: dict[str, set[str]] = defaultdict(set)
|
self._warned: dict[str, set[str]] = defaultdict(set)
|
||||||
self._tool_freq: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
|
self._tool_freq: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
|
||||||
self._tool_freq_warned: dict[str, set[str]] = defaultdict(set)
|
self._tool_freq_warned: dict[str, set[str]] = defaultdict(set)
|
||||||
# Per-thread/run queue of warnings to inject at the next model call.
|
|
||||||
# Populated by ``after_model`` (detection) and drained by
|
|
||||||
# ``wrap_model_call`` (injection); see module docstring.
|
|
||||||
self._pending_warnings: dict[tuple[str, str], list[str]] = defaultdict(list)
|
|
||||||
self._pending_warning_touch_order: OrderedDict[tuple[str, str], None] = OrderedDict()
|
|
||||||
self._max_pending_warning_keys = max(1, self.max_tracked_threads * 2)
|
|
||||||
|
|
||||||
@classmethod
|
@classmethod
|
||||||
def from_config(cls, config: LoopDetectionConfig) -> LoopDetectionMiddleware:
|
def from_config(cls, config: LoopDetectionConfig) -> LoopDetectionMiddleware:
|
||||||
@@ -249,20 +213,9 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
"""Extract thread_id from runtime context for per-thread tracking."""
|
"""Extract thread_id from runtime context for per-thread tracking."""
|
||||||
thread_id = runtime.context.get("thread_id") if runtime.context else None
|
thread_id = runtime.context.get("thread_id") if runtime.context else None
|
||||||
if thread_id:
|
if thread_id:
|
||||||
return str(thread_id)
|
return thread_id
|
||||||
return "default"
|
return "default"
|
||||||
|
|
||||||
def _get_run_id(self, runtime: Runtime) -> str:
|
|
||||||
"""Extract run_id from runtime context for per-run warning scoping."""
|
|
||||||
run_id = runtime.context.get("run_id") if runtime.context else None
|
|
||||||
if run_id:
|
|
||||||
return str(run_id)
|
|
||||||
return "default"
|
|
||||||
|
|
||||||
def _pending_key(self, runtime: Runtime) -> tuple[str, str]:
|
|
||||||
"""Return the pending-warning key for the current thread/run."""
|
|
||||||
return self._get_thread_id(runtime), self._get_run_id(runtime)
|
|
||||||
|
|
||||||
def _evict_if_needed(self) -> None:
|
def _evict_if_needed(self) -> None:
|
||||||
"""Evict least recently used threads if over the limit.
|
"""Evict least recently used threads if over the limit.
|
||||||
|
|
||||||
@@ -273,52 +226,8 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
self._warned.pop(evicted_id, None)
|
self._warned.pop(evicted_id, None)
|
||||||
self._tool_freq.pop(evicted_id, None)
|
self._tool_freq.pop(evicted_id, None)
|
||||||
self._tool_freq_warned.pop(evicted_id, None)
|
self._tool_freq_warned.pop(evicted_id, None)
|
||||||
for key in list(self._pending_warnings):
|
|
||||||
if key[0] == evicted_id:
|
|
||||||
self._drop_pending_warning_key_locked(key)
|
|
||||||
logger.debug("Evicted loop tracking for thread %s (LRU)", evicted_id)
|
logger.debug("Evicted loop tracking for thread %s (LRU)", evicted_id)
|
||||||
|
|
||||||
def _drop_pending_warning_key_locked(self, key: tuple[str, str]) -> None:
|
|
||||||
"""Drop all pending-warning bookkeeping for one thread/run key.
|
|
||||||
|
|
||||||
Must be called while holding self._lock.
|
|
||||||
"""
|
|
||||||
self._pending_warnings.pop(key, None)
|
|
||||||
self._pending_warning_touch_order.pop(key, None)
|
|
||||||
|
|
||||||
def _touch_pending_warning_key_locked(self, key: tuple[str, str]) -> None:
|
|
||||||
"""Mark a pending-warning key as recently used.
|
|
||||||
|
|
||||||
Must be called while holding self._lock.
|
|
||||||
"""
|
|
||||||
self._pending_warning_touch_order[key] = None
|
|
||||||
self._pending_warning_touch_order.move_to_end(key)
|
|
||||||
|
|
||||||
def _prune_pending_warning_state_locked(self, protected_key: tuple[str, str]) -> None:
|
|
||||||
"""Cap pending-warning state across abnormal or concurrent runs.
|
|
||||||
|
|
||||||
Must be called while holding self._lock.
|
|
||||||
"""
|
|
||||||
overflow = len(self._pending_warning_touch_order) - self._max_pending_warning_keys
|
|
||||||
if overflow <= 0:
|
|
||||||
return
|
|
||||||
|
|
||||||
candidates = [key for key in self._pending_warning_touch_order if key != protected_key]
|
|
||||||
for key in candidates[:overflow]:
|
|
||||||
self._drop_pending_warning_key_locked(key)
|
|
||||||
|
|
||||||
def _queue_pending_warning(self, runtime: Runtime, warning: str) -> None:
|
|
||||||
"""Queue one transient warning for the current thread/run with caps."""
|
|
||||||
pending_key = self._pending_key(runtime)
|
|
||||||
with self._lock:
|
|
||||||
warnings = self._pending_warnings[pending_key]
|
|
||||||
if warning not in warnings:
|
|
||||||
warnings.append(warning)
|
|
||||||
if len(warnings) > _MAX_PENDING_WARNINGS_PER_RUN:
|
|
||||||
del warnings[: len(warnings) - _MAX_PENDING_WARNINGS_PER_RUN]
|
|
||||||
self._touch_pending_warning_key_locked(pending_key)
|
|
||||||
self._prune_pending_warning_state_locked(protected_key=pending_key)
|
|
||||||
|
|
||||||
def _track_and_check(self, state: AgentState, runtime: Runtime) -> tuple[str | None, bool]:
|
def _track_and_check(self, state: AgentState, runtime: Runtime) -> tuple[str | None, bool]:
|
||||||
"""Track tool calls and check for loops.
|
"""Track tool calls and check for loops.
|
||||||
|
|
||||||
@@ -359,12 +268,6 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
if len(history) > self.window_size:
|
if len(history) > self.window_size:
|
||||||
history[:] = history[-self.window_size :]
|
history[:] = history[-self.window_size :]
|
||||||
|
|
||||||
warned_hashes = self._warned.get(thread_id)
|
|
||||||
if warned_hashes is not None:
|
|
||||||
warned_hashes.intersection_update(history)
|
|
||||||
if not warned_hashes:
|
|
||||||
self._warned.pop(thread_id, None)
|
|
||||||
|
|
||||||
count = history.count(call_hash)
|
count = history.count(call_hash)
|
||||||
tool_names = [tc.get("name", "?") for tc in tool_calls]
|
tool_names = [tc.get("name", "?") for tc in tool_calls]
|
||||||
|
|
||||||
@@ -478,10 +381,7 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
warning, hard_stop = self._track_and_check(state, runtime)
|
warning, hard_stop = self._track_and_check(state, runtime)
|
||||||
|
|
||||||
if hard_stop:
|
if hard_stop:
|
||||||
# Strip tool_calls from the last AIMessage to force text output.
|
# Strip tool_calls from the last AIMessage to force text output
|
||||||
# Once tool_calls are stripped, the AIMessage no longer requires
|
|
||||||
# matching ToolMessage responses, so mutating it in place here
|
|
||||||
# is safe for OpenAI/Moonshot pairing validators.
|
|
||||||
messages = state.get("messages", [])
|
messages = state.get("messages", [])
|
||||||
last_msg = messages[-1]
|
last_msg = messages[-1]
|
||||||
content = self._append_text(last_msg.content, warning or _HARD_STOP_MSG)
|
content = self._append_text(last_msg.content, warning or _HARD_STOP_MSG)
|
||||||
@@ -489,48 +389,33 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
return {"messages": [stripped_msg]}
|
return {"messages": [stripped_msg]}
|
||||||
|
|
||||||
if warning:
|
if warning:
|
||||||
# Defer injection to the next model call. We must NOT alter the
|
# WORKAROUND for v2.0-m1 — see #2724.
|
||||||
# AIMessage(tool_calls=...) here (would put framework words in
|
#
|
||||||
# the model's mouth, polluting downstream consumers like
|
# Append the warning to the AIMessage content instead of
|
||||||
# MemoryMiddleware), nor insert a separate non-tool message
|
# injecting a separate HumanMessage. Inserting any non-tool
|
||||||
# (would break OpenAI/Moonshot tool-call pairing because the
|
# message between an AIMessage(tool_calls=...) and its
|
||||||
# tools node has not produced ToolMessage responses yet). The
|
# ToolMessage responses breaks OpenAI/Moonshot strict pairing
|
||||||
# warning is delivered via ``wrap_model_call`` below.
|
# validation ("tool_call_ids did not have response messages")
|
||||||
self._queue_pending_warning(runtime, warning)
|
# because the tools node has not run yet at after_model time.
|
||||||
return None
|
# tool_calls are preserved so the tools node still executes.
|
||||||
|
#
|
||||||
|
# This is a temporary mitigation: mutating an existing
|
||||||
|
# AIMessage to carry framework-authored text leaks loop-warning
|
||||||
|
# text into downstream consumers (MemoryMiddleware fact
|
||||||
|
# extraction, TitleMiddleware, telemetry, model replay) as if
|
||||||
|
# the model said it. The proper fix is to defer warning
|
||||||
|
# injection from after_model to wrap_model_call so every prior
|
||||||
|
# ToolMessage is already in the request — see RFC #2517 (which
|
||||||
|
# lists "loop intervention does not leave invalid
|
||||||
|
# tool-call/tool-message state" as acceptance criteria) and
|
||||||
|
# the prototype on `fix/loop-detection-tool-call-pairing`.
|
||||||
|
messages = state.get("messages", [])
|
||||||
|
last_msg = messages[-1]
|
||||||
|
patched_msg = last_msg.model_copy(update={"content": self._append_text(last_msg.content, warning)})
|
||||||
|
return {"messages": [patched_msg]}
|
||||||
|
|
||||||
return None
|
return None
|
||||||
|
|
||||||
def _clear_other_run_pending_warnings(self, runtime: Runtime) -> None:
|
|
||||||
"""Drop stale pending warnings for previous runs in this thread."""
|
|
||||||
thread_id, current_run_id = self._pending_key(runtime)
|
|
||||||
with self._lock:
|
|
||||||
for key in list(self._pending_warnings):
|
|
||||||
if key[0] == thread_id and key[1] != current_run_id:
|
|
||||||
self._drop_pending_warning_key_locked(key)
|
|
||||||
|
|
||||||
def _clear_current_run_pending_warnings(self, runtime: Runtime) -> None:
|
|
||||||
"""Drop pending warnings owned by the current thread/run."""
|
|
||||||
pending_key = self._pending_key(runtime)
|
|
||||||
with self._lock:
|
|
||||||
self._drop_pending_warning_key_locked(pending_key)
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _format_warning_message(warnings: list[str]) -> str:
|
|
||||||
"""Merge pending warnings into one prompt message."""
|
|
||||||
deduped = list(dict.fromkeys(warnings))
|
|
||||||
return "\n\n".join(deduped)
|
|
||||||
|
|
||||||
@override
|
|
||||||
def before_agent(self, state: AgentState, runtime: Runtime) -> dict | None:
|
|
||||||
self._clear_other_run_pending_warnings(runtime)
|
|
||||||
return None
|
|
||||||
|
|
||||||
@override
|
|
||||||
async def abefore_agent(self, state: AgentState, runtime: Runtime) -> dict | None:
|
|
||||||
self._clear_other_run_pending_warnings(runtime)
|
|
||||||
return None
|
|
||||||
|
|
||||||
@override
|
@override
|
||||||
def after_model(self, state: AgentState, runtime: Runtime) -> dict | None:
|
def after_model(self, state: AgentState, runtime: Runtime) -> dict | None:
|
||||||
return self._apply(state, runtime)
|
return self._apply(state, runtime)
|
||||||
@@ -539,59 +424,6 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
async def aafter_model(self, state: AgentState, runtime: Runtime) -> dict | None:
|
async def aafter_model(self, state: AgentState, runtime: Runtime) -> dict | None:
|
||||||
return self._apply(state, runtime)
|
return self._apply(state, runtime)
|
||||||
|
|
||||||
@override
|
|
||||||
def after_agent(self, state: AgentState, runtime: Runtime) -> dict | None:
|
|
||||||
self._clear_current_run_pending_warnings(runtime)
|
|
||||||
return None
|
|
||||||
|
|
||||||
@override
|
|
||||||
async def aafter_agent(self, state: AgentState, runtime: Runtime) -> dict | None:
|
|
||||||
self._clear_current_run_pending_warnings(runtime)
|
|
||||||
return None
|
|
||||||
|
|
||||||
def _drain_pending_warnings(self, runtime: Runtime) -> list[str]:
|
|
||||||
"""Pop and return all queued warnings for *runtime*'s thread/run."""
|
|
||||||
pending_key = self._pending_key(runtime)
|
|
||||||
with self._lock:
|
|
||||||
warnings = self._pending_warnings.pop(pending_key, [])
|
|
||||||
self._pending_warning_touch_order.pop(pending_key, None)
|
|
||||||
return warnings
|
|
||||||
|
|
||||||
def _augment_request(self, request: ModelRequest) -> ModelRequest:
|
|
||||||
"""Append queued loop warnings (if any) to the outgoing message list.
|
|
||||||
|
|
||||||
The warning is placed *after* every existing message, including the
|
|
||||||
ToolMessage responses to the previous AIMessage(tool_calls). This
|
|
||||||
keeps ``assistant tool_calls -> tool_messages`` pairing intact for
|
|
||||||
OpenAI/Moonshot, avoids the Anthropic mid-stream SystemMessage
|
|
||||||
restriction (we use HumanMessage), and never mutates an existing
|
|
||||||
AIMessage.
|
|
||||||
"""
|
|
||||||
warnings = self._drain_pending_warnings(request.runtime)
|
|
||||||
if not warnings:
|
|
||||||
return request
|
|
||||||
new_messages = [
|
|
||||||
*request.messages,
|
|
||||||
HumanMessage(content=self._format_warning_message(warnings), name="loop_warning"),
|
|
||||||
]
|
|
||||||
return request.override(messages=new_messages)
|
|
||||||
|
|
||||||
@override
|
|
||||||
def wrap_model_call(
|
|
||||||
self,
|
|
||||||
request: ModelRequest,
|
|
||||||
handler: Callable[[ModelRequest], ModelResponse],
|
|
||||||
) -> ModelCallResult:
|
|
||||||
return handler(self._augment_request(request))
|
|
||||||
|
|
||||||
@override
|
|
||||||
async def awrap_model_call(
|
|
||||||
self,
|
|
||||||
request: ModelRequest,
|
|
||||||
handler: Callable[[ModelRequest], Awaitable[ModelResponse]],
|
|
||||||
) -> ModelCallResult:
|
|
||||||
return await handler(self._augment_request(request))
|
|
||||||
|
|
||||||
def reset(self, thread_id: str | None = None) -> None:
|
def reset(self, thread_id: str | None = None) -> None:
|
||||||
"""Clear tracking state. If thread_id given, clear only that thread."""
|
"""Clear tracking state. If thread_id given, clear only that thread."""
|
||||||
with self._lock:
|
with self._lock:
|
||||||
@@ -600,13 +432,8 @@ class LoopDetectionMiddleware(AgentMiddleware[AgentState]):
|
|||||||
self._warned.pop(thread_id, None)
|
self._warned.pop(thread_id, None)
|
||||||
self._tool_freq.pop(thread_id, None)
|
self._tool_freq.pop(thread_id, None)
|
||||||
self._tool_freq_warned.pop(thread_id, None)
|
self._tool_freq_warned.pop(thread_id, None)
|
||||||
for key in list(self._pending_warnings):
|
|
||||||
if key[0] == thread_id:
|
|
||||||
self._drop_pending_warning_key_locked(key)
|
|
||||||
else:
|
else:
|
||||||
self._history.clear()
|
self._history.clear()
|
||||||
self._warned.clear()
|
self._warned.clear()
|
||||||
self._tool_freq.clear()
|
self._tool_freq.clear()
|
||||||
self._tool_freq_warned.clear()
|
self._tool_freq_warned.clear()
|
||||||
self._pending_warnings.clear()
|
|
||||||
self._pending_warning_touch_order.clear()
|
|
||||||
|
|||||||
-317
@@ -1,317 +0,0 @@
|
|||||||
"""Suppress tool execution when the provider safety-terminated the response.
|
|
||||||
|
|
||||||
Background — see issue bytedance/deer-flow#3028.
|
|
||||||
|
|
||||||
Some providers (OpenAI ``finish_reason='content_filter'``, Anthropic
|
|
||||||
``stop_reason='refusal'``, Gemini ``finish_reason='SAFETY'`` ...) can stop
|
|
||||||
generation mid-stream while still returning partially-formed ``tool_calls``.
|
|
||||||
LangChain's tool router treats any AIMessage with a non-empty ``tool_calls``
|
|
||||||
field as "go execute these", so half-truncated arguments — e.g. a markdown
|
|
||||||
``write_file`` that stops in the middle of a sentence — get dispatched as if
|
|
||||||
they were complete. The agent then sees the truncated file, tries to fix it,
|
|
||||||
gets filtered again, and loops.
|
|
||||||
|
|
||||||
This middleware sits at ``after_model`` and gates that behaviour: when a
|
|
||||||
configured ``SafetyTerminationDetector`` fires *and* the AIMessage carries
|
|
||||||
tool calls, we strip the tool calls (both structured and raw provider
|
|
||||||
payloads), append a user-facing explanation, and stash observability fields
|
|
||||||
in ``additional_kwargs.safety_termination`` so logs, traces, and SSE
|
|
||||||
consumers can see what happened.
|
|
||||||
|
|
||||||
Hook choice: ``after_model`` (not ``wrap_model_call``) because the response
|
|
||||||
is a *normal* return — not an exception — and we want to participate in the
|
|
||||||
same after-model chain as ``LoopDetectionMiddleware``, with which we share
|
|
||||||
the same tool-call-suppression mechanic but a different trigger.
|
|
||||||
|
|
||||||
Placement: register *after* ``LoopDetectionMiddleware`` in the middleware
|
|
||||||
list. LangChain factory wires ``after_model`` edges in reverse list order
|
|
||||||
(``langchain/agents/factory.py:add_edge("model", middleware_w_after_model[-1])``,
|
|
||||||
then walks ``range(len-1, 0, -1)``), so the *last* registered middleware is
|
|
||||||
the *first* to observe the model output. Registering Safety after Loop
|
|
||||||
means Safety sees the raw response first, clears tool calls if it fires,
|
|
||||||
and Loop then accounts against the cleaned message.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import logging
|
|
||||||
from typing import TYPE_CHECKING, override
|
|
||||||
|
|
||||||
from langchain.agents import AgentState
|
|
||||||
from langchain.agents.middleware import AgentMiddleware
|
|
||||||
from langchain_core.messages import AIMessage
|
|
||||||
from langgraph.runtime import Runtime
|
|
||||||
|
|
||||||
from deerflow.agents.middlewares.safety_termination_detectors import (
|
|
||||||
SafetyTermination,
|
|
||||||
SafetyTerminationDetector,
|
|
||||||
default_detectors,
|
|
||||||
)
|
|
||||||
from deerflow.agents.middlewares.tool_call_metadata import clone_ai_message_with_tool_calls
|
|
||||||
|
|
||||||
if TYPE_CHECKING:
|
|
||||||
from deerflow.config.safety_finish_reason_config import SafetyFinishReasonConfig
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
|
||||||
|
|
||||||
|
|
||||||
_USER_FACING_MESSAGE = (
|
|
||||||
"The model provider stopped this response with a safety-related signal "
|
|
||||||
"({reason_field}={reason_value!r}, detector={detector!r}). Any tool "
|
|
||||||
"calls produced in this turn were suppressed because their arguments "
|
|
||||||
"may be truncated and unsafe to execute. Please rephrase the request "
|
|
||||||
"or ask for a narrower output."
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class SafetyFinishReasonMiddleware(AgentMiddleware[AgentState]):
|
|
||||||
"""Strip tool_calls from AIMessages flagged by a SafetyTerminationDetector."""
|
|
||||||
|
|
||||||
def __init__(self, detectors: list[SafetyTerminationDetector] | None = None) -> None:
|
|
||||||
super().__init__()
|
|
||||||
# Copy so caller mutations after construction don't leak into us.
|
|
||||||
self._detectors: list[SafetyTerminationDetector] = list(detectors) if detectors else default_detectors()
|
|
||||||
|
|
||||||
@classmethod
|
|
||||||
def from_config(cls, config: SafetyFinishReasonConfig) -> SafetyFinishReasonMiddleware:
|
|
||||||
"""Construct from validated Pydantic config, honouring the
|
|
||||||
reflection-loaded detector list when provided.
|
|
||||||
|
|
||||||
An explicit empty list is intentionally rejected — it would silently
|
|
||||||
disable detection while leaving the middleware in the chain, which
|
|
||||||
is the worst of both worlds. Use ``enabled: false`` instead.
|
|
||||||
"""
|
|
||||||
if config.detectors is None:
|
|
||||||
return cls()
|
|
||||||
|
|
||||||
if not config.detectors:
|
|
||||||
raise ValueError("safety_finish_reason.detectors must be omitted (use built-ins) or contain at least one entry; use enabled=false to disable the middleware entirely.")
|
|
||||||
|
|
||||||
from deerflow.reflection import resolve_variable
|
|
||||||
|
|
||||||
detectors: list[SafetyTerminationDetector] = []
|
|
||||||
for entry in config.detectors:
|
|
||||||
detector_cls = resolve_variable(entry.use)
|
|
||||||
kwargs = dict(entry.config) if entry.config else {}
|
|
||||||
detector = detector_cls(**kwargs)
|
|
||||||
if not isinstance(detector, SafetyTerminationDetector):
|
|
||||||
raise TypeError(f"{entry.use} did not produce a SafetyTerminationDetector (got {type(detector).__name__}); ensure it has a `name` attribute and a `detect(message)` method")
|
|
||||||
detectors.append(detector)
|
|
||||||
return cls(detectors=detectors)
|
|
||||||
|
|
||||||
# ----- detection -------------------------------------------------------
|
|
||||||
|
|
||||||
def _detect(self, message: AIMessage) -> SafetyTermination | None:
|
|
||||||
for detector in self._detectors:
|
|
||||||
try:
|
|
||||||
hit = detector.detect(message)
|
|
||||||
except Exception: # noqa: BLE001 - never let a buggy detector break the agent run
|
|
||||||
logger.exception("SafetyTerminationDetector %r raised; treating as no-match", getattr(detector, "name", type(detector).__name__))
|
|
||||||
continue
|
|
||||||
if hit is not None:
|
|
||||||
return hit
|
|
||||||
return None
|
|
||||||
|
|
||||||
# ----- message rewriting ----------------------------------------------
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _append_user_message(content: object, text: str) -> str | list:
|
|
||||||
"""Append a plain-text explanation to AIMessage content.
|
|
||||||
|
|
||||||
Mirrors ``LoopDetectionMiddleware._append_text`` so list-content
|
|
||||||
responses (Anthropic thinking blocks, vLLM reasoning splits) keep
|
|
||||||
their structure instead of being string-coerced into a TypeError.
|
|
||||||
"""
|
|
||||||
if content is None or content == "":
|
|
||||||
return text
|
|
||||||
if isinstance(content, list):
|
|
||||||
return [*content, {"type": "text", "text": f"\n\n{text}"}]
|
|
||||||
if isinstance(content, str):
|
|
||||||
return content + f"\n\n{text}"
|
|
||||||
return str(content) + f"\n\n{text}"
|
|
||||||
|
|
||||||
def _build_suppressed_message(
|
|
||||||
self,
|
|
||||||
message: AIMessage,
|
|
||||||
termination: SafetyTermination,
|
|
||||||
) -> AIMessage:
|
|
||||||
suppressed_names = [tc.get("name") or "unknown" for tc in (message.tool_calls or [])]
|
|
||||||
explanation = _USER_FACING_MESSAGE.format(
|
|
||||||
reason_field=termination.reason_field,
|
|
||||||
reason_value=termination.reason_value,
|
|
||||||
detector=termination.detector,
|
|
||||||
)
|
|
||||||
new_content = self._append_user_message(message.content, explanation)
|
|
||||||
|
|
||||||
# clone_ai_message_with_tool_calls handles structured tool_calls,
|
|
||||||
# raw additional_kwargs.tool_calls, and function_call in one shot.
|
|
||||||
# It only rewrites finish_reason when the old value was "tool_calls",
|
|
||||||
# which is not our case — content_filter / refusal / SAFETY stay put
|
|
||||||
# so downstream SSE / converters keep seeing the real provider reason.
|
|
||||||
cleared = clone_ai_message_with_tool_calls(message, [], content=new_content)
|
|
||||||
|
|
||||||
# Re-clone additional_kwargs so we don't accidentally mutate the
|
|
||||||
# dict returned by clone_ai_message_with_tool_calls (which already
|
|
||||||
# made a shallow copy, but downstream model_copy still references
|
|
||||||
# it). Then stamp the observability record.
|
|
||||||
kwargs = dict(getattr(cleared, "additional_kwargs", None) or {})
|
|
||||||
kwargs["safety_termination"] = {
|
|
||||||
"detector": termination.detector,
|
|
||||||
"reason_field": termination.reason_field,
|
|
||||||
"reason_value": termination.reason_value,
|
|
||||||
"suppressed_tool_call_count": len(suppressed_names),
|
|
||||||
"suppressed_tool_call_names": suppressed_names,
|
|
||||||
"extras": dict(termination.extras) if termination.extras else {},
|
|
||||||
}
|
|
||||||
return cleared.model_copy(update={"additional_kwargs": kwargs})
|
|
||||||
|
|
||||||
# ----- observability ---------------------------------------------------
|
|
||||||
|
|
||||||
def _emit_event(
|
|
||||||
self,
|
|
||||||
termination: SafetyTermination,
|
|
||||||
suppressed_names: list[str],
|
|
||||||
runtime: Runtime,
|
|
||||||
) -> None:
|
|
||||||
"""Notify SSE consumers (e.g. the web UI) that a tool turn was
|
|
||||||
suppressed so they can reconcile any "tool starting..." placeholders
|
|
||||||
already streamed to the user. Failures are logged at debug and
|
|
||||||
ignored — this is a best-effort signal."""
|
|
||||||
try:
|
|
||||||
from langgraph.config import get_stream_writer
|
|
||||||
|
|
||||||
writer = get_stream_writer()
|
|
||||||
except Exception: # noqa: BLE001
|
|
||||||
logger.debug("get_stream_writer unavailable; skipping safety_termination event", exc_info=True)
|
|
||||||
return
|
|
||||||
|
|
||||||
thread_id = None
|
|
||||||
if runtime is not None and getattr(runtime, "context", None):
|
|
||||||
thread_id = runtime.context.get("thread_id") if isinstance(runtime.context, dict) else None
|
|
||||||
|
|
||||||
try:
|
|
||||||
writer(
|
|
||||||
{
|
|
||||||
"type": "safety_termination",
|
|
||||||
"detector": termination.detector,
|
|
||||||
"reason_field": termination.reason_field,
|
|
||||||
"reason_value": termination.reason_value,
|
|
||||||
"suppressed_tool_call_count": len(suppressed_names),
|
|
||||||
"suppressed_tool_call_names": suppressed_names,
|
|
||||||
"thread_id": thread_id,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
except Exception: # noqa: BLE001
|
|
||||||
logger.debug("Failed to emit safety_termination stream event", exc_info=True)
|
|
||||||
|
|
||||||
def _record_audit_event(
|
|
||||||
self,
|
|
||||||
termination: SafetyTermination,
|
|
||||||
message,
|
|
||||||
tool_calls: list[dict],
|
|
||||||
runtime: Runtime,
|
|
||||||
) -> None:
|
|
||||||
"""Write a ``middleware:safety_termination`` record to RunEventStore
|
|
||||||
for post-run auditability.
|
|
||||||
|
|
||||||
The custom stream event in ``_emit_event`` is consumed by live SSE
|
|
||||||
clients and disappears after the run; this event is persisted so an
|
|
||||||
operator can answer "which runs were safety-suppressed today?" from
|
|
||||||
a single SQL query without joining the message body. Worker exposes
|
|
||||||
the run-scoped ``RunJournal`` via ``runtime.context["__run_journal"]``;
|
|
||||||
absent in unit-test / subagent / no-event-store paths, in which case
|
|
||||||
we silently skip.
|
|
||||||
|
|
||||||
Tool **arguments** are deliberately **not** recorded — those are the
|
|
||||||
very content the provider filtered; persisting them would defeat the
|
|
||||||
purpose of the safety filter. Names / count / ids are sufficient for
|
|
||||||
audit and debugging (issue #3028 review).
|
|
||||||
"""
|
|
||||||
journal = None
|
|
||||||
if runtime is not None and getattr(runtime, "context", None):
|
|
||||||
context = runtime.context
|
|
||||||
if isinstance(context, dict):
|
|
||||||
journal = context.get("__run_journal")
|
|
||||||
if journal is None:
|
|
||||||
return
|
|
||||||
|
|
||||||
suppressed_names = [tc.get("name") or "unknown" for tc in tool_calls]
|
|
||||||
suppressed_ids = [tc.get("id") for tc in tool_calls if tc.get("id")]
|
|
||||||
|
|
||||||
changes = {
|
|
||||||
"detector": termination.detector,
|
|
||||||
"reason_field": termination.reason_field,
|
|
||||||
"reason_value": termination.reason_value,
|
|
||||||
"suppressed_tool_call_count": len(tool_calls),
|
|
||||||
"suppressed_tool_call_names": suppressed_names,
|
|
||||||
"suppressed_tool_call_ids": suppressed_ids,
|
|
||||||
"message_id": getattr(message, "id", None),
|
|
||||||
"extras": dict(termination.extras) if termination.extras else {},
|
|
||||||
}
|
|
||||||
|
|
||||||
try:
|
|
||||||
journal.record_middleware(
|
|
||||||
tag="safety_termination",
|
|
||||||
name=type(self).__name__,
|
|
||||||
hook="after_model",
|
|
||||||
action="suppress_tool_calls",
|
|
||||||
changes=changes,
|
|
||||||
)
|
|
||||||
except Exception: # noqa: BLE001
|
|
||||||
# Audit-event persistence must never break agent execution.
|
|
||||||
logger.debug("Failed to record middleware:safety_termination event", exc_info=True)
|
|
||||||
|
|
||||||
# ----- main apply ------------------------------------------------------
|
|
||||||
|
|
||||||
def _apply(self, state: AgentState, runtime: Runtime) -> dict | None:
|
|
||||||
messages = state.get("messages", [])
|
|
||||||
if not messages:
|
|
||||||
return None
|
|
||||||
|
|
||||||
last = messages[-1]
|
|
||||||
if not isinstance(last, AIMessage):
|
|
||||||
return None
|
|
||||||
|
|
||||||
# Issue scope: only intervene when there's something to suppress.
|
|
||||||
# ``content_filter`` without tool_calls is allowed through unchanged
|
|
||||||
# so the partial text response (if any) reaches the user naturally.
|
|
||||||
tool_calls = last.tool_calls
|
|
||||||
if not tool_calls:
|
|
||||||
return None
|
|
||||||
|
|
||||||
termination = self._detect(last)
|
|
||||||
if termination is None:
|
|
||||||
return None
|
|
||||||
|
|
||||||
patched = self._build_suppressed_message(last, termination)
|
|
||||||
|
|
||||||
thread_id = None
|
|
||||||
if runtime is not None and getattr(runtime, "context", None):
|
|
||||||
thread_id = runtime.context.get("thread_id") if isinstance(runtime.context, dict) else None
|
|
||||||
|
|
||||||
logger.warning(
|
|
||||||
"Provider safety termination detected — suppressed %d tool call(s)",
|
|
||||||
len(tool_calls),
|
|
||||||
extra={
|
|
||||||
"thread_id": thread_id,
|
|
||||||
"detector": termination.detector,
|
|
||||||
"reason_field": termination.reason_field,
|
|
||||||
"reason_value": termination.reason_value,
|
|
||||||
"suppressed_tool_call_names": [tc.get("name") for tc in tool_calls],
|
|
||||||
},
|
|
||||||
)
|
|
||||||
|
|
||||||
self._emit_event(termination, [tc.get("name") or "unknown" for tc in tool_calls], runtime)
|
|
||||||
self._record_audit_event(termination, last, list(tool_calls), runtime)
|
|
||||||
|
|
||||||
return {"messages": [patched]}
|
|
||||||
|
|
||||||
# ----- hooks -----------------------------------------------------------
|
|
||||||
|
|
||||||
@override
|
|
||||||
def after_model(self, state: AgentState, runtime: Runtime) -> dict | None:
|
|
||||||
return self._apply(state, runtime)
|
|
||||||
|
|
||||||
@override
|
|
||||||
async def aafter_model(self, state: AgentState, runtime: Runtime) -> dict | None:
|
|
||||||
return self._apply(state, runtime)
|
|
||||||
@@ -1,237 +0,0 @@
|
|||||||
"""Detectors for provider-side safety termination signals.
|
|
||||||
|
|
||||||
Different LLM providers signal "I stopped this response for safety reasons"
|
|
||||||
through different fields with different values. This module defines a small
|
|
||||||
strategy interface and three built-in detectors that cover the major
|
|
||||||
providers DeerFlow supports today. New providers (Wenxin, Hunyuan, Bedrock
|
|
||||||
adapters, in-house gateways, ...) can be added by implementing
|
|
||||||
``SafetyTerminationDetector`` and wiring it through
|
|
||||||
``config.yaml: safety_finish_reason.detectors``.
|
|
||||||
|
|
||||||
The middleware that consumes these detectors lives in
|
|
||||||
``safety_finish_reason_middleware.py``.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from dataclasses import dataclass, field
|
|
||||||
from typing import Any, Protocol, runtime_checkable
|
|
||||||
|
|
||||||
from langchain_core.messages import AIMessage
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class SafetyTermination:
|
|
||||||
"""A detected safety-related termination signal.
|
|
||||||
|
|
||||||
Attributes:
|
|
||||||
detector: Name of the detector that produced this result. Used for
|
|
||||||
observability so operators can see which provider rule fired.
|
|
||||||
reason_field: The message metadata field that carried the signal
|
|
||||||
(e.g. ``finish_reason``, ``stop_reason``).
|
|
||||||
reason_value: The actual value of that field
|
|
||||||
(e.g. ``content_filter``, ``refusal``, ``SAFETY``).
|
|
||||||
extras: Provider-specific metadata that may help downstream
|
|
||||||
consumers (e.g. Azure OpenAI content_filter_results, Gemini
|
|
||||||
safety_ratings). Detectors are free to populate or skip this.
|
|
||||||
"""
|
|
||||||
|
|
||||||
detector: str
|
|
||||||
reason_field: str
|
|
||||||
reason_value: str
|
|
||||||
extras: dict[str, Any] = field(default_factory=dict)
|
|
||||||
|
|
||||||
|
|
||||||
@runtime_checkable
|
|
||||||
class SafetyTerminationDetector(Protocol):
|
|
||||||
"""Strategy interface for provider safety termination detection."""
|
|
||||||
|
|
||||||
name: str
|
|
||||||
|
|
||||||
def detect(self, message: AIMessage) -> SafetyTermination | None:
|
|
||||||
"""Return a SafetyTermination if *message* indicates provider safety
|
|
||||||
termination, otherwise return ``None``.
|
|
||||||
|
|
||||||
Implementations must be side-effect free and tolerant of missing or
|
|
||||||
oddly-typed metadata — detectors run on every model response.
|
|
||||||
"""
|
|
||||||
...
|
|
||||||
|
|
||||||
|
|
||||||
def _get_metadata_value(message: AIMessage, field_name: str) -> str | None:
|
|
||||||
"""Read a string-typed value from either ``response_metadata`` or
|
|
||||||
``additional_kwargs``.
|
|
||||||
|
|
||||||
LangChain provider adapters are inconsistent about where they stash
|
|
||||||
provider stop signals. Most modern adapters use ``response_metadata``,
|
|
||||||
but some legacy / passthrough paths still surface them via
|
|
||||||
``additional_kwargs``. We check both, in that order, and only accept
|
|
||||||
string values — Pydantic enums or dicts are ignored so we never raise
|
|
||||||
on malformed inputs.
|
|
||||||
"""
|
|
||||||
for container_name in ("response_metadata", "additional_kwargs"):
|
|
||||||
container = getattr(message, container_name, None) or {}
|
|
||||||
if not isinstance(container, dict):
|
|
||||||
continue
|
|
||||||
value = container.get(field_name)
|
|
||||||
if isinstance(value, str) and value:
|
|
||||||
return value
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
class OpenAICompatibleContentFilterDetector:
|
|
||||||
"""OpenAI-compatible content_filter signal.
|
|
||||||
|
|
||||||
Covers OpenAI, Azure OpenAI, Moonshot/Kimi, DeepSeek, Mistral, vLLM,
|
|
||||||
Qwen (OpenAI-compatible mode), and any other adapter that follows the
|
|
||||||
OpenAI ``finish_reason`` convention.
|
|
||||||
|
|
||||||
Some Chinese providers ship custom OpenAI-compatible gateways that use
|
|
||||||
alternative tokens like ``sensitive`` or ``violation``. Extend the set
|
|
||||||
via the ``finish_reasons`` kwarg in config.
|
|
||||||
"""
|
|
||||||
|
|
||||||
name = "openai_compatible_content_filter"
|
|
||||||
|
|
||||||
def __init__(self, finish_reasons: list[str] | tuple[str, ...] | None = None) -> None:
|
|
||||||
configured = finish_reasons if finish_reasons is not None else ("content_filter",)
|
|
||||||
self._finish_reasons: frozenset[str] = frozenset(r.lower() for r in configured)
|
|
||||||
|
|
||||||
def detect(self, message: AIMessage) -> SafetyTermination | None:
|
|
||||||
value = _get_metadata_value(message, "finish_reason")
|
|
||||||
if value is None or value.lower() not in self._finish_reasons:
|
|
||||||
return None
|
|
||||||
|
|
||||||
extras: dict[str, Any] = {}
|
|
||||||
# Azure OpenAI ships a structured content_filter_results block; carry it
|
|
||||||
# through so operators can see *what* was filtered without re-tracing.
|
|
||||||
response_metadata = getattr(message, "response_metadata", None) or {}
|
|
||||||
if isinstance(response_metadata, dict):
|
|
||||||
filter_results = response_metadata.get("content_filter_results")
|
|
||||||
if filter_results:
|
|
||||||
extras["content_filter_results"] = filter_results
|
|
||||||
|
|
||||||
return SafetyTermination(
|
|
||||||
detector=self.name,
|
|
||||||
reason_field="finish_reason",
|
|
||||||
reason_value=value,
|
|
||||||
extras=extras,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class AnthropicRefusalDetector:
|
|
||||||
"""Anthropic ``stop_reason == "refusal"`` signal.
|
|
||||||
|
|
||||||
Anthropic models surface safety refusals via a dedicated ``stop_reason``
|
|
||||||
rather than ``finish_reason``. See:
|
|
||||||
https://platform.claude.com/docs/en/test-and-evaluate/strengthen-guardrails/handle-streaming-refusals
|
|
||||||
"""
|
|
||||||
|
|
||||||
name = "anthropic_refusal"
|
|
||||||
|
|
||||||
def __init__(self, stop_reasons: list[str] | tuple[str, ...] | None = None) -> None:
|
|
||||||
configured = stop_reasons if stop_reasons is not None else ("refusal",)
|
|
||||||
self._stop_reasons: frozenset[str] = frozenset(r.lower() for r in configured)
|
|
||||||
|
|
||||||
def detect(self, message: AIMessage) -> SafetyTermination | None:
|
|
||||||
value = _get_metadata_value(message, "stop_reason")
|
|
||||||
if value is None or value.lower() not in self._stop_reasons:
|
|
||||||
return None
|
|
||||||
return SafetyTermination(
|
|
||||||
detector=self.name,
|
|
||||||
reason_field="stop_reason",
|
|
||||||
reason_value=value,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class GeminiSafetyDetector:
|
|
||||||
"""Gemini / Vertex AI safety-related finish reasons.
|
|
||||||
|
|
||||||
Gemini uses the same ``finish_reason`` field as OpenAI but with an
|
|
||||||
enumerated upper-case taxonomy. The default set covers every Gemini
|
|
||||||
finish_reason that means "the model stopped because the content/image
|
|
||||||
tripped a safety, blocklist, recitation, or PII filter" — i.e. cases
|
|
||||||
where any tool_calls returned alongside are likely truncated/
|
|
||||||
unreliable. Full enum:
|
|
||||||
https://docs.cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform_v1.types.Candidate.FinishReason
|
|
||||||
|
|
||||||
Intentionally **excluded** from the default set:
|
|
||||||
- ``STOP`` — normal termination.
|
|
||||||
- ``MAX_TOKENS`` — output length truncation, not safety
|
|
||||||
(same root failure mode as
|
|
||||||
content_filter, but issue #3028
|
|
||||||
scopes it out; expose separately if
|
|
||||||
desired).
|
|
||||||
- ``LANGUAGE`` / ``NO_IMAGE`` — capability mismatches, unrelated to
|
|
||||||
safety; tool_calls would be absent
|
|
||||||
anyway.
|
|
||||||
- ``MALFORMED_FUNCTION_CALL`` /
|
|
||||||
``UNEXPECTED_TOOL_CALL`` — tool-call protocol errors. The
|
|
||||||
tool_calls are *also* unreliable
|
|
||||||
here, but the failure category is
|
|
||||||
distinct from safety filtering;
|
|
||||||
handle in a dedicated detector to
|
|
||||||
keep observability records honest.
|
|
||||||
- ``OTHER`` / ``IMAGE_OTHER`` /
|
|
||||||
``FINISH_REASON_UNSPECIFIED`` — too broad to enable by default;
|
|
||||||
opt in via ``finish_reasons=`` if
|
|
||||||
your provider abuses these.
|
|
||||||
"""
|
|
||||||
|
|
||||||
name = "gemini_safety"
|
|
||||||
|
|
||||||
_DEFAULT_FINISH_REASONS = (
|
|
||||||
# Text safety
|
|
||||||
"SAFETY",
|
|
||||||
"BLOCKLIST",
|
|
||||||
"PROHIBITED_CONTENT",
|
|
||||||
"SPII",
|
|
||||||
"RECITATION",
|
|
||||||
# Image safety (multimodal generation)
|
|
||||||
"IMAGE_SAFETY",
|
|
||||||
"IMAGE_PROHIBITED_CONTENT",
|
|
||||||
"IMAGE_RECITATION",
|
|
||||||
)
|
|
||||||
|
|
||||||
def __init__(self, finish_reasons: list[str] | tuple[str, ...] | None = None) -> None:
|
|
||||||
configured = finish_reasons if finish_reasons is not None else self._DEFAULT_FINISH_REASONS
|
|
||||||
self._finish_reasons: frozenset[str] = frozenset(r.upper() for r in configured)
|
|
||||||
|
|
||||||
def detect(self, message: AIMessage) -> SafetyTermination | None:
|
|
||||||
value = _get_metadata_value(message, "finish_reason")
|
|
||||||
if value is None or value.upper() not in self._finish_reasons:
|
|
||||||
return None
|
|
||||||
|
|
||||||
extras: dict[str, Any] = {}
|
|
||||||
response_metadata = getattr(message, "response_metadata", None) or {}
|
|
||||||
if isinstance(response_metadata, dict):
|
|
||||||
# Gemini surfaces per-category scoring under safety_ratings.
|
|
||||||
ratings = response_metadata.get("safety_ratings")
|
|
||||||
if ratings:
|
|
||||||
extras["safety_ratings"] = ratings
|
|
||||||
|
|
||||||
return SafetyTermination(
|
|
||||||
detector=self.name,
|
|
||||||
reason_field="finish_reason",
|
|
||||||
reason_value=value,
|
|
||||||
extras=extras,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def default_detectors() -> list[SafetyTerminationDetector]:
|
|
||||||
"""Built-in detector set used when no custom detectors are configured."""
|
|
||||||
return [
|
|
||||||
OpenAICompatibleContentFilterDetector(),
|
|
||||||
AnthropicRefusalDetector(),
|
|
||||||
GeminiSafetyDetector(),
|
|
||||||
]
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"AnthropicRefusalDetector",
|
|
||||||
"GeminiSafetyDetector",
|
|
||||||
"OpenAICompatibleContentFilterDetector",
|
|
||||||
"SafetyTermination",
|
|
||||||
"SafetyTerminationDetector",
|
|
||||||
"default_detectors",
|
|
||||||
]
|
|
||||||
@@ -160,11 +160,7 @@ class TitleMiddleware(AgentMiddleware[TitleMiddlewareState]):
|
|||||||
prompt, user_msg = self._build_title_prompt(state)
|
prompt, user_msg = self._build_title_prompt(state)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
# attach_tracing=False because ``_get_runnable_config()`` inherits
|
model_kwargs = {"thinking_enabled": False}
|
||||||
# the graph-level RunnableConfig (set in ``_make_lead_agent``) whose
|
|
||||||
# callbacks already carry tracing handlers; binding them again at
|
|
||||||
# the model level would emit duplicate spans.
|
|
||||||
model_kwargs = {"thinking_enabled": False, "attach_tracing": False}
|
|
||||||
if self._app_config is not None:
|
if self._app_config is not None:
|
||||||
model_kwargs["app_config"] = self._app_config
|
model_kwargs["app_config"] = self._app_config
|
||||||
if config.model_name:
|
if config.model_name:
|
||||||
|
|||||||
@@ -7,26 +7,20 @@ reminder message so the model still knows about the outstanding todo list.
|
|||||||
|
|
||||||
Additionally, this middleware prevents the agent from exiting the loop while
|
Additionally, this middleware prevents the agent from exiting the loop while
|
||||||
there are still incomplete todo items. When the model produces a final response
|
there are still incomplete todo items. When the model produces a final response
|
||||||
(no tool calls) but todos are not yet complete, the middleware queues a reminder
|
(no tool calls) but todos are not yet complete, the middleware injects a reminder
|
||||||
for the next model request and jumps back to the model node to force continued
|
and jumps back to the model node to force continued engagement.
|
||||||
engagement. The completion reminder is injected via ``wrap_model_call`` instead
|
|
||||||
of being persisted into graph state as a normal user-visible message.
|
|
||||||
"""
|
"""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import threading
|
|
||||||
from collections.abc import Awaitable, Callable
|
|
||||||
from typing import Any, override
|
from typing import Any, override
|
||||||
|
|
||||||
from langchain.agents.middleware import TodoListMiddleware
|
from langchain.agents.middleware import TodoListMiddleware
|
||||||
from langchain.agents.middleware.todo import Todo
|
from langchain.agents.middleware.todo import PlanningState, Todo
|
||||||
from langchain.agents.middleware.types import ModelCallResult, ModelRequest, ModelResponse, hook_config
|
from langchain.agents.middleware.types import hook_config
|
||||||
from langchain_core.messages import AIMessage, HumanMessage
|
from langchain_core.messages import AIMessage, HumanMessage
|
||||||
from langgraph.runtime import Runtime
|
from langgraph.runtime import Runtime
|
||||||
|
|
||||||
from deerflow.agents.thread_state import ThreadState
|
|
||||||
|
|
||||||
|
|
||||||
def _todos_in_messages(messages: list[Any]) -> bool:
|
def _todos_in_messages(messages: list[Any]) -> bool:
|
||||||
"""Return True if any AIMessage in *messages* contains a write_todos tool call."""
|
"""Return True if any AIMessage in *messages* contains a write_todos tool call."""
|
||||||
@@ -61,51 +55,6 @@ def _format_todos(todos: list[Todo]) -> str:
|
|||||||
return "\n".join(lines)
|
return "\n".join(lines)
|
||||||
|
|
||||||
|
|
||||||
def _format_completion_reminder(todos: list[Todo]) -> str:
|
|
||||||
"""Format a completion reminder for incomplete todo items."""
|
|
||||||
incomplete = [t for t in todos if t.get("status") != "completed"]
|
|
||||||
incomplete_text = "\n".join(f"- [{t.get('status', 'pending')}] {t.get('content', '')}" for t in incomplete)
|
|
||||||
return (
|
|
||||||
"<system_reminder>\n"
|
|
||||||
"You have incomplete todo items that must be finished before giving your final response:\n\n"
|
|
||||||
f"{incomplete_text}\n\n"
|
|
||||||
"Please continue working on these tasks. Call `write_todos` to mark items as completed "
|
|
||||||
"as you finish them, and only respond when all items are done.\n"
|
|
||||||
"</system_reminder>"
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
_TOOL_CALL_FINISH_REASONS = {"tool_calls", "function_call"}
|
|
||||||
|
|
||||||
|
|
||||||
def _has_tool_call_intent_or_error(message: AIMessage) -> bool:
|
|
||||||
"""Return True when an AIMessage is not a clean final answer.
|
|
||||||
|
|
||||||
Todo completion reminders should only fire when the model has produced a
|
|
||||||
plain final response. Provider/tool parsing details have moved across
|
|
||||||
LangChain versions and integrations, so keep all tool-intent/error signals
|
|
||||||
behind this helper instead of checking one concrete field at the call site.
|
|
||||||
"""
|
|
||||||
if message.tool_calls:
|
|
||||||
return True
|
|
||||||
|
|
||||||
if getattr(message, "invalid_tool_calls", None):
|
|
||||||
return True
|
|
||||||
|
|
||||||
# Backward/provider compatibility: some integrations preserve raw or legacy
|
|
||||||
# tool-call intent in additional_kwargs even when structured tool_calls is
|
|
||||||
# empty. If this helper changes, update the matching sentinel test
|
|
||||||
# `TestToolCallIntentOrError.test_langchain_ai_message_tool_fields_are_explicitly_handled`;
|
|
||||||
# if that test fails after a LangChain upgrade, review this helper so new
|
|
||||||
# tool-call/error fields are not silently treated as clean final answers.
|
|
||||||
additional_kwargs = getattr(message, "additional_kwargs", {}) or {}
|
|
||||||
if additional_kwargs.get("tool_calls") or additional_kwargs.get("function_call"):
|
|
||||||
return True
|
|
||||||
|
|
||||||
response_metadata = getattr(message, "response_metadata", {}) or {}
|
|
||||||
return response_metadata.get("finish_reason") in _TOOL_CALL_FINISH_REASONS
|
|
||||||
|
|
||||||
|
|
||||||
class TodoMiddleware(TodoListMiddleware):
|
class TodoMiddleware(TodoListMiddleware):
|
||||||
"""Extends TodoListMiddleware with `write_todos` context-loss detection.
|
"""Extends TodoListMiddleware with `write_todos` context-loss detection.
|
||||||
|
|
||||||
@@ -115,12 +64,10 @@ class TodoMiddleware(TodoListMiddleware):
|
|||||||
and injects a reminder message so the model can continue tracking progress.
|
and injects a reminder message so the model can continue tracking progress.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
state_schema = ThreadState
|
|
||||||
|
|
||||||
@override
|
@override
|
||||||
def before_model(
|
def before_model(
|
||||||
self,
|
self,
|
||||||
state: ThreadState,
|
state: PlanningState,
|
||||||
runtime: Runtime,
|
runtime: Runtime,
|
||||||
) -> dict[str, Any] | None:
|
) -> dict[str, Any] | None:
|
||||||
"""Inject a todo-list reminder when write_todos has left the context window."""
|
"""Inject a todo-list reminder when write_todos has left the context window."""
|
||||||
@@ -142,7 +89,6 @@ class TodoMiddleware(TodoListMiddleware):
|
|||||||
formatted = _format_todos(todos)
|
formatted = _format_todos(todos)
|
||||||
reminder = HumanMessage(
|
reminder = HumanMessage(
|
||||||
name="todo_reminder",
|
name="todo_reminder",
|
||||||
additional_kwargs={"hide_from_ui": True},
|
|
||||||
content=(
|
content=(
|
||||||
"<system_reminder>\n"
|
"<system_reminder>\n"
|
||||||
"Your todo list from earlier is no longer visible in the current context window, "
|
"Your todo list from earlier is no longer visible in the current context window, "
|
||||||
@@ -158,7 +104,7 @@ class TodoMiddleware(TodoListMiddleware):
|
|||||||
@override
|
@override
|
||||||
async def abefore_model(
|
async def abefore_model(
|
||||||
self,
|
self,
|
||||||
state: ThreadState,
|
state: PlanningState,
|
||||||
runtime: Runtime,
|
runtime: Runtime,
|
||||||
) -> dict[str, Any] | None:
|
) -> dict[str, Any] | None:
|
||||||
"""Async version of before_model."""
|
"""Async version of before_model."""
|
||||||
@@ -167,106 +113,12 @@ class TodoMiddleware(TodoListMiddleware):
|
|||||||
# Maximum number of completion reminders before allowing the agent to exit.
|
# Maximum number of completion reminders before allowing the agent to exit.
|
||||||
# This prevents infinite loops when the agent cannot make further progress.
|
# This prevents infinite loops when the agent cannot make further progress.
|
||||||
_MAX_COMPLETION_REMINDERS = 2
|
_MAX_COMPLETION_REMINDERS = 2
|
||||||
# Hard cap for per-run reminder bookkeeping in long-lived middleware instances.
|
|
||||||
_MAX_COMPLETION_REMINDER_KEYS = 4096
|
|
||||||
|
|
||||||
def __init__(self, *args: Any, **kwargs: Any) -> None:
|
|
||||||
super().__init__(*args, **kwargs)
|
|
||||||
self._lock = threading.Lock()
|
|
||||||
self._pending_completion_reminders: dict[tuple[str, str], list[str]] = {}
|
|
||||||
self._completion_reminder_counts: dict[tuple[str, str], int] = {}
|
|
||||||
self._completion_reminder_touch_order: dict[tuple[str, str], int] = {}
|
|
||||||
self._completion_reminder_next_order = 0
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _get_thread_id(runtime: Runtime) -> str:
|
|
||||||
context = getattr(runtime, "context", None)
|
|
||||||
thread_id = context.get("thread_id") if context else None
|
|
||||||
return str(thread_id) if thread_id else "default"
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _get_run_id(runtime: Runtime) -> str:
|
|
||||||
context = getattr(runtime, "context", None)
|
|
||||||
run_id = context.get("run_id") if context else None
|
|
||||||
return str(run_id) if run_id else "default"
|
|
||||||
|
|
||||||
def _pending_key(self, runtime: Runtime) -> tuple[str, str]:
|
|
||||||
return self._get_thread_id(runtime), self._get_run_id(runtime)
|
|
||||||
|
|
||||||
def _touch_completion_reminder_key_locked(self, key: tuple[str, str]) -> None:
|
|
||||||
self._completion_reminder_next_order += 1
|
|
||||||
self._completion_reminder_touch_order[key] = self._completion_reminder_next_order
|
|
||||||
|
|
||||||
def _completion_reminder_keys_locked(self) -> set[tuple[str, str]]:
|
|
||||||
keys = set(self._pending_completion_reminders)
|
|
||||||
keys.update(self._completion_reminder_counts)
|
|
||||||
keys.update(self._completion_reminder_touch_order)
|
|
||||||
return keys
|
|
||||||
|
|
||||||
def _drop_completion_reminder_key_locked(self, key: tuple[str, str]) -> None:
|
|
||||||
self._pending_completion_reminders.pop(key, None)
|
|
||||||
self._completion_reminder_counts.pop(key, None)
|
|
||||||
self._completion_reminder_touch_order.pop(key, None)
|
|
||||||
|
|
||||||
def _prune_completion_reminder_state_locked(self, protected_key: tuple[str, str]) -> None:
|
|
||||||
keys = self._completion_reminder_keys_locked()
|
|
||||||
overflow = len(keys) - self._MAX_COMPLETION_REMINDER_KEYS
|
|
||||||
if overflow <= 0:
|
|
||||||
return
|
|
||||||
|
|
||||||
candidates = [key for key in keys if key != protected_key]
|
|
||||||
candidates.sort(key=lambda key: self._completion_reminder_touch_order.get(key, 0))
|
|
||||||
for key in candidates[:overflow]:
|
|
||||||
self._drop_completion_reminder_key_locked(key)
|
|
||||||
|
|
||||||
def _queue_completion_reminder(self, runtime: Runtime, reminder: str) -> None:
|
|
||||||
key = self._pending_key(runtime)
|
|
||||||
with self._lock:
|
|
||||||
self._pending_completion_reminders.setdefault(key, []).append(reminder)
|
|
||||||
self._completion_reminder_counts[key] = self._completion_reminder_counts.get(key, 0) + 1
|
|
||||||
self._touch_completion_reminder_key_locked(key)
|
|
||||||
self._prune_completion_reminder_state_locked(protected_key=key)
|
|
||||||
|
|
||||||
def _completion_reminder_count_for_runtime(self, runtime: Runtime) -> int:
|
|
||||||
key = self._pending_key(runtime)
|
|
||||||
with self._lock:
|
|
||||||
return self._completion_reminder_counts.get(key, 0)
|
|
||||||
|
|
||||||
def _drain_completion_reminders(self, runtime: Runtime) -> list[str]:
|
|
||||||
key = self._pending_key(runtime)
|
|
||||||
with self._lock:
|
|
||||||
reminders = self._pending_completion_reminders.pop(key, [])
|
|
||||||
if reminders or key in self._completion_reminder_counts:
|
|
||||||
self._touch_completion_reminder_key_locked(key)
|
|
||||||
return reminders
|
|
||||||
|
|
||||||
def _clear_other_run_completion_reminders(self, runtime: Runtime) -> None:
|
|
||||||
thread_id, current_run_id = self._pending_key(runtime)
|
|
||||||
with self._lock:
|
|
||||||
for key in self._completion_reminder_keys_locked():
|
|
||||||
if key[0] == thread_id and key[1] != current_run_id:
|
|
||||||
self._drop_completion_reminder_key_locked(key)
|
|
||||||
|
|
||||||
def _clear_current_run_completion_reminders(self, runtime: Runtime) -> None:
|
|
||||||
key = self._pending_key(runtime)
|
|
||||||
with self._lock:
|
|
||||||
self._drop_completion_reminder_key_locked(key)
|
|
||||||
|
|
||||||
@override
|
|
||||||
def before_agent(self, state: ThreadState, runtime: Runtime) -> dict[str, Any] | None:
|
|
||||||
self._clear_other_run_completion_reminders(runtime)
|
|
||||||
return None
|
|
||||||
|
|
||||||
@override
|
|
||||||
async def abefore_agent(self, state: ThreadState, runtime: Runtime) -> dict[str, Any] | None:
|
|
||||||
self._clear_other_run_completion_reminders(runtime)
|
|
||||||
return None
|
|
||||||
|
|
||||||
@hook_config(can_jump_to=["model"])
|
@hook_config(can_jump_to=["model"])
|
||||||
@override
|
@override
|
||||||
def after_model(
|
def after_model(
|
||||||
self,
|
self,
|
||||||
state: ThreadState,
|
state: PlanningState,
|
||||||
runtime: Runtime,
|
runtime: Runtime,
|
||||||
) -> dict[str, Any] | None:
|
) -> dict[str, Any] | None:
|
||||||
"""Prevent premature agent exit when todo items are still incomplete.
|
"""Prevent premature agent exit when todo items are still incomplete.
|
||||||
@@ -285,12 +137,10 @@ class TodoMiddleware(TodoListMiddleware):
|
|||||||
if base_result is not None:
|
if base_result is not None:
|
||||||
return base_result
|
return base_result
|
||||||
|
|
||||||
# 2. Only intervene when the agent wants to exit cleanly. Tool-call
|
# 2. Only intervene when the agent wants to exit (no tool calls).
|
||||||
# intent or tool-call parse errors should be handled by the tool path
|
|
||||||
# instead of being masked by todo reminders.
|
|
||||||
messages = state.get("messages") or []
|
messages = state.get("messages") or []
|
||||||
last_ai = next((m for m in reversed(messages) if isinstance(m, AIMessage)), None)
|
last_ai = next((m for m in reversed(messages) if isinstance(m, AIMessage)), None)
|
||||||
if not last_ai or _has_tool_call_intent_or_error(last_ai):
|
if not last_ai or last_ai.tool_calls:
|
||||||
return None
|
return None
|
||||||
|
|
||||||
# 3. Allow exit when all todos are completed or there are no todos.
|
# 3. Allow exit when all todos are completed or there are no todos.
|
||||||
@@ -299,65 +149,31 @@ class TodoMiddleware(TodoListMiddleware):
|
|||||||
return None
|
return None
|
||||||
|
|
||||||
# 4. Enforce a reminder cap to prevent infinite re-engagement loops.
|
# 4. Enforce a reminder cap to prevent infinite re-engagement loops.
|
||||||
if self._completion_reminder_count_for_runtime(runtime) >= self._MAX_COMPLETION_REMINDERS:
|
if _completion_reminder_count(messages) >= self._MAX_COMPLETION_REMINDERS:
|
||||||
return None
|
return None
|
||||||
|
|
||||||
# 5. Queue a reminder for the next model request and jump back. We must
|
# 5. Inject a reminder and force the agent back to the model.
|
||||||
# not persist this control prompt as a normal HumanMessage, otherwise it
|
incomplete = [t for t in todos if t.get("status") != "completed"]
|
||||||
# can leak into user-visible message streams and saved transcripts.
|
incomplete_text = "\n".join(f"- [{t.get('status', 'pending')}] {t.get('content', '')}" for t in incomplete)
|
||||||
self._queue_completion_reminder(runtime, _format_completion_reminder(todos))
|
reminder = HumanMessage(
|
||||||
return {"jump_to": "model"}
|
name="todo_completion_reminder",
|
||||||
|
content=(
|
||||||
|
"<system_reminder>\n"
|
||||||
|
"You have incomplete todo items that must be finished before giving your final response:\n\n"
|
||||||
|
f"{incomplete_text}\n\n"
|
||||||
|
"Please continue working on these tasks. Call `write_todos` to mark items as completed "
|
||||||
|
"as you finish them, and only respond when all items are done.\n"
|
||||||
|
"</system_reminder>"
|
||||||
|
),
|
||||||
|
)
|
||||||
|
return {"jump_to": "model", "messages": [reminder]}
|
||||||
|
|
||||||
@override
|
@override
|
||||||
@hook_config(can_jump_to=["model"])
|
@hook_config(can_jump_to=["model"])
|
||||||
async def aafter_model(
|
async def aafter_model(
|
||||||
self,
|
self,
|
||||||
state: ThreadState,
|
state: PlanningState,
|
||||||
runtime: Runtime,
|
runtime: Runtime,
|
||||||
) -> dict[str, Any] | None:
|
) -> dict[str, Any] | None:
|
||||||
"""Async version of after_model."""
|
"""Async version of after_model."""
|
||||||
return self.after_model(state, runtime)
|
return self.after_model(state, runtime)
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _format_pending_completion_reminders(reminders: list[str]) -> str:
|
|
||||||
return "\n\n".join(dict.fromkeys(reminders))
|
|
||||||
|
|
||||||
def _augment_request(self, request: ModelRequest) -> ModelRequest:
|
|
||||||
reminders = self._drain_completion_reminders(request.runtime)
|
|
||||||
if not reminders:
|
|
||||||
return request
|
|
||||||
new_messages = [
|
|
||||||
*request.messages,
|
|
||||||
HumanMessage(
|
|
||||||
content=self._format_pending_completion_reminders(reminders),
|
|
||||||
name="todo_completion_reminder",
|
|
||||||
additional_kwargs={"hide_from_ui": True},
|
|
||||||
),
|
|
||||||
]
|
|
||||||
return request.override(messages=new_messages)
|
|
||||||
|
|
||||||
@override
|
|
||||||
def wrap_model_call(
|
|
||||||
self,
|
|
||||||
request: ModelRequest,
|
|
||||||
handler: Callable[[ModelRequest], ModelResponse],
|
|
||||||
) -> ModelCallResult:
|
|
||||||
return handler(self._augment_request(request))
|
|
||||||
|
|
||||||
@override
|
|
||||||
async def awrap_model_call(
|
|
||||||
self,
|
|
||||||
request: ModelRequest,
|
|
||||||
handler: Callable[[ModelRequest], Awaitable[ModelResponse]],
|
|
||||||
) -> ModelCallResult:
|
|
||||||
return await handler(self._augment_request(request))
|
|
||||||
|
|
||||||
@override
|
|
||||||
def after_agent(self, state: ThreadState, runtime: Runtime) -> dict[str, Any] | None:
|
|
||||||
self._clear_current_run_completion_reminders(runtime)
|
|
||||||
return None
|
|
||||||
|
|
||||||
@override
|
|
||||||
async def aafter_agent(self, state: ThreadState, runtime: Runtime) -> dict[str, Any] | None:
|
|
||||||
self._clear_current_run_completion_reminders(runtime)
|
|
||||||
return None
|
|
||||||
|
|||||||
+1
-13
@@ -77,11 +77,9 @@ def _build_runtime_middlewares(
|
|||||||
"""Build shared base middlewares for agent execution."""
|
"""Build shared base middlewares for agent execution."""
|
||||||
from deerflow.agents.middlewares.llm_error_handling_middleware import LLMErrorHandlingMiddleware
|
from deerflow.agents.middlewares.llm_error_handling_middleware import LLMErrorHandlingMiddleware
|
||||||
from deerflow.agents.middlewares.thread_data_middleware import ThreadDataMiddleware
|
from deerflow.agents.middlewares.thread_data_middleware import ThreadDataMiddleware
|
||||||
from deerflow.agents.middlewares.tool_output_budget_middleware import ToolOutputBudgetMiddleware
|
|
||||||
from deerflow.sandbox.middleware import SandboxMiddleware
|
from deerflow.sandbox.middleware import SandboxMiddleware
|
||||||
|
|
||||||
middlewares: list[AgentMiddleware] = [
|
middlewares: list[AgentMiddleware] = [
|
||||||
ToolOutputBudgetMiddleware.from_app_config(app_config),
|
|
||||||
ThreadDataMiddleware(lazy_init=lazy_init),
|
ThreadDataMiddleware(lazy_init=lazy_init),
|
||||||
SandboxMiddleware(lazy_init=lazy_init),
|
SandboxMiddleware(lazy_init=lazy_init),
|
||||||
]
|
]
|
||||||
@@ -89,7 +87,7 @@ def _build_runtime_middlewares(
|
|||||||
if include_uploads:
|
if include_uploads:
|
||||||
from deerflow.agents.middlewares.uploads_middleware import UploadsMiddleware
|
from deerflow.agents.middlewares.uploads_middleware import UploadsMiddleware
|
||||||
|
|
||||||
middlewares.insert(2, UploadsMiddleware())
|
middlewares.insert(1, UploadsMiddleware())
|
||||||
|
|
||||||
if include_dangling_tool_call_patch:
|
if include_dangling_tool_call_patch:
|
||||||
from deerflow.agents.middlewares.dangling_tool_call_middleware import DanglingToolCallMiddleware
|
from deerflow.agents.middlewares.dangling_tool_call_middleware import DanglingToolCallMiddleware
|
||||||
@@ -166,14 +164,4 @@ def build_subagent_runtime_middlewares(
|
|||||||
|
|
||||||
middlewares.append(ViewImageMiddleware())
|
middlewares.append(ViewImageMiddleware())
|
||||||
|
|
||||||
# Same provider safety-termination guard the lead agent uses — subagents
|
|
||||||
# are equally exposed to truncated tool_calls returned with
|
|
||||||
# finish_reason=content_filter (and friends), and the bad call would then
|
|
||||||
# propagate back to the lead agent via the task tool result.
|
|
||||||
safety_config = app_config.safety_finish_reason
|
|
||||||
if safety_config.enabled:
|
|
||||||
from deerflow.agents.middlewares.safety_finish_reason_middleware import SafetyFinishReasonMiddleware
|
|
||||||
|
|
||||||
middlewares.append(SafetyFinishReasonMiddleware.from_config(safety_config))
|
|
||||||
|
|
||||||
return middlewares
|
return middlewares
|
||||||
|
|||||||
-489
@@ -1,489 +0,0 @@
|
|||||||
"""Middleware that enforces a per-result budget on tool outputs.
|
|
||||||
|
|
||||||
Oversized tool results are persisted to disk and replaced with a compact
|
|
||||||
preview containing a file reference. When disk persistence is
|
|
||||||
unavailable the middleware falls back to head+tail truncation so the
|
|
||||||
model context is never blown by a single large tool return.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import asyncio
|
|
||||||
import logging
|
|
||||||
import os
|
|
||||||
import uuid
|
|
||||||
from collections.abc import Awaitable, Callable
|
|
||||||
from dataclasses import replace as dc_replace
|
|
||||||
from typing import Any, override
|
|
||||||
|
|
||||||
from langchain.agents import AgentState
|
|
||||||
from langchain.agents.middleware import AgentMiddleware
|
|
||||||
from langchain.agents.middleware.types import ModelCallResult, ModelRequest, ModelResponse
|
|
||||||
from langchain_core.messages import ToolMessage
|
|
||||||
from langgraph.prebuilt.tool_node import ToolCallRequest
|
|
||||||
from langgraph.types import Command
|
|
||||||
|
|
||||||
from deerflow.config.tool_output_config import ToolOutputConfig
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
|
||||||
|
|
||||||
|
|
||||||
def _default_config() -> ToolOutputConfig:
|
|
||||||
return ToolOutputConfig()
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# Text helpers
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
|
|
||||||
def _message_text(content: Any) -> str | None:
|
|
||||||
"""Extract a plain-text representation from a ToolMessage content field.
|
|
||||||
|
|
||||||
Returns ``None`` for non-string / multimodal content so the caller
|
|
||||||
can skip budget enforcement (images, structured blocks, etc.).
|
|
||||||
"""
|
|
||||||
if isinstance(content, str):
|
|
||||||
return content
|
|
||||||
if content is None:
|
|
||||||
return None
|
|
||||||
if isinstance(content, list):
|
|
||||||
pieces: list[str] = []
|
|
||||||
for part in content:
|
|
||||||
if isinstance(part, str):
|
|
||||||
pieces.append(part)
|
|
||||||
elif isinstance(part, dict) and isinstance(part.get("text"), str):
|
|
||||||
pieces.append(part["text"])
|
|
||||||
else:
|
|
||||||
return None
|
|
||||||
return "\n".join(pieces) if pieces else None
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
def _snap_to_line_boundary(text: str, pos: int) -> int:
|
|
||||||
"""Return *pos* or the nearest preceding newline+1, whichever is closer.
|
|
||||||
|
|
||||||
Used so that previews and truncations end on a complete line when
|
|
||||||
possible. If no newline exists in the second half of ``text[:pos]``
|
|
||||||
the original *pos* is returned unchanged.
|
|
||||||
"""
|
|
||||||
if pos <= 0 or pos >= len(text):
|
|
||||||
return pos
|
|
||||||
half = pos // 2
|
|
||||||
nl = text.rfind("\n", half, pos)
|
|
||||||
if nl >= 0:
|
|
||||||
return nl + 1
|
|
||||||
return pos
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# Disk persistence
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
_EXT_MAP: dict[str, str] = {
|
|
||||||
"bash": "log",
|
|
||||||
"bash_tool": "log",
|
|
||||||
"web_fetch": "log",
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
def _sanitize_tool_name(name: str) -> str:
|
|
||||||
"""Strip path separators and traversal components from a tool name."""
|
|
||||||
base = os.path.basename(name)
|
|
||||||
safe = base.replace("..", "").replace("/", "_").replace("\\", "_")
|
|
||||||
return safe or "unknown"
|
|
||||||
|
|
||||||
|
|
||||||
def _externalize(
|
|
||||||
content: str,
|
|
||||||
*,
|
|
||||||
tool_name: str,
|
|
||||||
tool_call_id: str,
|
|
||||||
outputs_path: str,
|
|
||||||
storage_subdir: str,
|
|
||||||
) -> str | None:
|
|
||||||
"""Write *content* to disk and return the virtual path, or ``None`` on failure."""
|
|
||||||
if os.path.isabs(storage_subdir) or ".." in storage_subdir:
|
|
||||||
return None
|
|
||||||
storage_dir = os.path.join(outputs_path, storage_subdir)
|
|
||||||
try:
|
|
||||||
os.makedirs(storage_dir, exist_ok=True)
|
|
||||||
except OSError:
|
|
||||||
return None
|
|
||||||
|
|
||||||
safe_name = _sanitize_tool_name(tool_name)
|
|
||||||
ext = _EXT_MAP.get(tool_name, "txt")
|
|
||||||
short_id = uuid.uuid4().hex[:12]
|
|
||||||
filename = f"{safe_name}-{short_id}.{ext}"
|
|
||||||
filepath = os.path.join(storage_dir, filename)
|
|
||||||
|
|
||||||
if not os.path.abspath(filepath).startswith(os.path.abspath(storage_dir)):
|
|
||||||
return None
|
|
||||||
|
|
||||||
try:
|
|
||||||
with open(filepath, "w", encoding="utf-8") as f:
|
|
||||||
f.write(content)
|
|
||||||
except OSError:
|
|
||||||
return None
|
|
||||||
|
|
||||||
virtual_base = "/mnt/user-data/outputs"
|
|
||||||
return f"{virtual_base}/{storage_subdir}/{filename}"
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# Preview / fallback builders
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
|
|
||||||
def _build_preview(
|
|
||||||
content: str,
|
|
||||||
*,
|
|
||||||
tool_name: str,
|
|
||||||
virtual_path: str,
|
|
||||||
head_chars: int,
|
|
||||||
tail_chars: int,
|
|
||||||
) -> str:
|
|
||||||
"""Build a preview with a file reference for externalized output."""
|
|
||||||
total = len(content)
|
|
||||||
head_end = _snap_to_line_boundary(content, min(head_chars, total))
|
|
||||||
tail_start = max(head_end, total - tail_chars)
|
|
||||||
tail_start_snapped = _snap_to_line_boundary(content, tail_start)
|
|
||||||
if tail_start_snapped > head_end:
|
|
||||||
tail_start = tail_start_snapped
|
|
||||||
|
|
||||||
head = content[:head_end]
|
|
||||||
tail = content[tail_start:] if tail_start < total else ""
|
|
||||||
|
|
||||||
omitted = total - len(head) - len(tail)
|
|
||||||
ref = f"\n\n[Full {tool_name} output saved to {virtual_path} ({total} chars, ~{total // 4} tokens). Use read_file with start_line and end_line to access specific sections. {omitted} chars omitted from this preview.]\n\n"
|
|
||||||
|
|
||||||
parts = [head, ref]
|
|
||||||
if tail:
|
|
||||||
parts.append(tail)
|
|
||||||
return "".join(parts)
|
|
||||||
|
|
||||||
|
|
||||||
def _build_fallback(
|
|
||||||
content: str,
|
|
||||||
*,
|
|
||||||
tool_name: str,
|
|
||||||
max_chars: int,
|
|
||||||
head_chars: int,
|
|
||||||
tail_chars: int,
|
|
||||||
) -> str:
|
|
||||||
"""Build a head+tail truncation when disk persistence is unavailable.
|
|
||||||
|
|
||||||
The returned string is guaranteed to be no longer than *max_chars*.
|
|
||||||
"""
|
|
||||||
total = len(content)
|
|
||||||
if max_chars <= 0 or total <= max_chars:
|
|
||||||
return content
|
|
||||||
|
|
||||||
marker_template = "\n\n[... {n} chars omitted from {tn} output. Persistent storage unavailable. Consider narrowing the query or using more specific parameters.]\n\n"
|
|
||||||
marker_overhead = len(marker_template.format(n=total, tn=tool_name))
|
|
||||||
|
|
||||||
if marker_overhead >= max_chars:
|
|
||||||
return content[:max_chars]
|
|
||||||
|
|
||||||
budget = max_chars - marker_overhead
|
|
||||||
effective_head = min(head_chars, budget)
|
|
||||||
effective_tail = min(tail_chars, max(0, budget - effective_head))
|
|
||||||
|
|
||||||
head_end = _snap_to_line_boundary(content, min(effective_head, total))
|
|
||||||
tail_start = max(head_end, total - effective_tail)
|
|
||||||
tail_start_snapped = _snap_to_line_boundary(content, tail_start)
|
|
||||||
if tail_start_snapped > head_end:
|
|
||||||
tail_start = tail_start_snapped
|
|
||||||
|
|
||||||
head = content[:head_end]
|
|
||||||
tail = content[tail_start:] if tail_start < total else ""
|
|
||||||
omitted = total - len(head) - len(tail)
|
|
||||||
|
|
||||||
marker = marker_template.format(n=omitted, tn=tool_name)
|
|
||||||
|
|
||||||
parts = [head, marker]
|
|
||||||
if tail:
|
|
||||||
parts.append(tail)
|
|
||||||
return "".join(parts)
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# Core budget logic
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
|
|
||||||
def _resolve_outputs_path(request: ToolCallRequest) -> str | None:
|
|
||||||
"""Best-effort extraction of the thread outputs path."""
|
|
||||||
runtime = getattr(request, "runtime", None)
|
|
||||||
if runtime is None:
|
|
||||||
return None
|
|
||||||
state = getattr(runtime, "state", None)
|
|
||||||
if state is None:
|
|
||||||
return None
|
|
||||||
thread_data = state.get("thread_data")
|
|
||||||
if not isinstance(thread_data, dict):
|
|
||||||
return None
|
|
||||||
outputs_path = thread_data.get("outputs_path")
|
|
||||||
return outputs_path if isinstance(outputs_path, str) else None
|
|
||||||
|
|
||||||
|
|
||||||
def _budget_content(
|
|
||||||
content: str,
|
|
||||||
*,
|
|
||||||
tool_name: str,
|
|
||||||
tool_call_id: str,
|
|
||||||
outputs_path: str | None,
|
|
||||||
config: ToolOutputConfig,
|
|
||||||
) -> str | None:
|
|
||||||
"""Apply budget to *content*. Returns ``None`` if no change needed."""
|
|
||||||
threshold = config.tool_overrides.get(tool_name, config.externalize_min_chars)
|
|
||||||
if threshold <= 0 and config.fallback_max_chars <= 0:
|
|
||||||
return None
|
|
||||||
if len(content) <= threshold and len(content) <= config.fallback_max_chars:
|
|
||||||
return None
|
|
||||||
|
|
||||||
if threshold > 0 and len(content) > threshold and outputs_path:
|
|
||||||
virtual_path = _externalize(
|
|
||||||
content,
|
|
||||||
tool_name=tool_name,
|
|
||||||
tool_call_id=tool_call_id,
|
|
||||||
outputs_path=outputs_path,
|
|
||||||
storage_subdir=config.storage_subdir,
|
|
||||||
)
|
|
||||||
if virtual_path is not None:
|
|
||||||
logger.info(
|
|
||||||
"Externalized %s output (%d chars) to %s",
|
|
||||||
tool_name,
|
|
||||||
len(content),
|
|
||||||
virtual_path,
|
|
||||||
)
|
|
||||||
return _build_preview(
|
|
||||||
content,
|
|
||||||
tool_name=tool_name,
|
|
||||||
virtual_path=virtual_path,
|
|
||||||
head_chars=config.preview_head_chars,
|
|
||||||
tail_chars=config.preview_tail_chars,
|
|
||||||
)
|
|
||||||
|
|
||||||
if config.fallback_max_chars > 0 and len(content) > config.fallback_max_chars:
|
|
||||||
logger.warning(
|
|
||||||
"Fallback-truncating %s output: %d chars → %d max",
|
|
||||||
tool_name,
|
|
||||||
len(content),
|
|
||||||
config.fallback_max_chars,
|
|
||||||
)
|
|
||||||
return _build_fallback(
|
|
||||||
content,
|
|
||||||
tool_name=tool_name,
|
|
||||||
max_chars=config.fallback_max_chars,
|
|
||||||
head_chars=config.fallback_head_chars,
|
|
||||||
tail_chars=config.fallback_tail_chars,
|
|
||||||
)
|
|
||||||
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# Result patchers
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
|
|
||||||
def _patch_tool_message(msg: ToolMessage, config: ToolOutputConfig, outputs_path: str | None) -> ToolMessage:
|
|
||||||
"""Apply budget to a single ToolMessage. Returns the original if unchanged."""
|
|
||||||
tool_name = msg.name or "unknown"
|
|
||||||
if tool_name in config.exempt_tools:
|
|
||||||
return msg
|
|
||||||
|
|
||||||
text = _message_text(msg.content)
|
|
||||||
if text is None:
|
|
||||||
return msg
|
|
||||||
|
|
||||||
replacement = _budget_content(
|
|
||||||
text,
|
|
||||||
tool_name=tool_name,
|
|
||||||
tool_call_id=msg.tool_call_id or "",
|
|
||||||
outputs_path=outputs_path,
|
|
||||||
config=config,
|
|
||||||
)
|
|
||||||
if replacement is None:
|
|
||||||
return msg
|
|
||||||
|
|
||||||
update: dict[str, Any] = {"content": replacement}
|
|
||||||
if getattr(msg, "response_metadata", None):
|
|
||||||
update["response_metadata"] = dict(msg.response_metadata)
|
|
||||||
if getattr(msg, "additional_kwargs", None):
|
|
||||||
update["additional_kwargs"] = dict(msg.additional_kwargs)
|
|
||||||
return msg.model_copy(update=update)
|
|
||||||
|
|
||||||
|
|
||||||
def _effective_trigger(tool_name: str, config: ToolOutputConfig) -> int:
|
|
||||||
"""Smallest content length that could trigger budgeting for *tool_name*.
|
|
||||||
|
|
||||||
Mirrors the trigger conditions in :func:`_budget_content` (per-tool
|
|
||||||
externalize threshold OR global fallback), so the pre-scan never produces
|
|
||||||
a false negative. Returns ``-1`` when nothing could ever trigger.
|
|
||||||
"""
|
|
||||||
candidates: list[int] = []
|
|
||||||
externalize = config.tool_overrides.get(tool_name, config.externalize_min_chars)
|
|
||||||
if externalize > 0:
|
|
||||||
candidates.append(externalize)
|
|
||||||
if config.fallback_max_chars > 0:
|
|
||||||
candidates.append(config.fallback_max_chars)
|
|
||||||
return min(candidates) if candidates else -1
|
|
||||||
|
|
||||||
|
|
||||||
def _tool_message_over_budget(msg: ToolMessage, config: ToolOutputConfig) -> bool:
|
|
||||||
"""Cheap, per-tool-aware check: is this ToolMessage non-exempt and over its trigger?"""
|
|
||||||
if (msg.name or "") in config.exempt_tools:
|
|
||||||
return False
|
|
||||||
trigger = _effective_trigger(msg.name or "", config)
|
|
||||||
if trigger < 0:
|
|
||||||
return False
|
|
||||||
text = _message_text(msg.content)
|
|
||||||
return text is not None and len(text) > trigger
|
|
||||||
|
|
||||||
|
|
||||||
def _needs_budget(result: ToolMessage | Command, config: ToolOutputConfig) -> bool:
|
|
||||||
"""Fast check whether *result* could need budgeting (avoids thread offload for small outputs)."""
|
|
||||||
if isinstance(result, ToolMessage):
|
|
||||||
return _tool_message_over_budget(result, config)
|
|
||||||
update = getattr(result, "update", None)
|
|
||||||
if isinstance(update, dict):
|
|
||||||
for msg in update.get("messages", []):
|
|
||||||
if isinstance(msg, ToolMessage) and _tool_message_over_budget(msg, config):
|
|
||||||
return True
|
|
||||||
return False
|
|
||||||
|
|
||||||
|
|
||||||
def _patch_result(result: ToolMessage | Command, config: ToolOutputConfig, outputs_path: str | None) -> ToolMessage | Command:
|
|
||||||
"""Apply budget to a tool call result (ToolMessage or Command)."""
|
|
||||||
if isinstance(result, ToolMessage):
|
|
||||||
return _patch_tool_message(result, config, outputs_path)
|
|
||||||
|
|
||||||
update = getattr(result, "update", None)
|
|
||||||
if not isinstance(update, dict):
|
|
||||||
return result
|
|
||||||
|
|
||||||
messages = update.get("messages")
|
|
||||||
if not isinstance(messages, list):
|
|
||||||
return result
|
|
||||||
|
|
||||||
new_messages: list[Any] = []
|
|
||||||
changed = False
|
|
||||||
for msg in messages:
|
|
||||||
if isinstance(msg, ToolMessage):
|
|
||||||
patched = _patch_tool_message(msg, config, outputs_path)
|
|
||||||
if patched is not msg:
|
|
||||||
changed = True
|
|
||||||
new_messages.append(patched)
|
|
||||||
else:
|
|
||||||
new_messages.append(msg)
|
|
||||||
|
|
||||||
if not changed:
|
|
||||||
return result
|
|
||||||
|
|
||||||
return dc_replace(result, update={**update, "messages": new_messages})
|
|
||||||
|
|
||||||
|
|
||||||
def _patch_model_messages(messages: list[Any], config: ToolOutputConfig) -> list[Any] | None:
|
|
||||||
"""Apply budget to historical ToolMessages in a model request. Returns ``None`` if unchanged.
|
|
||||||
|
|
||||||
A cheap pre-scan bails out before allocating a new list when no historical
|
|
||||||
ToolMessage exceeds the budget — the common case once every result has
|
|
||||||
already been budgeted at tool-call time, so a long history is not rebuilt
|
|
||||||
on every model call.
|
|
||||||
"""
|
|
||||||
if not any(isinstance(msg, ToolMessage) and _tool_message_over_budget(msg, config) for msg in messages):
|
|
||||||
return None
|
|
||||||
|
|
||||||
updated: list[Any] = []
|
|
||||||
changed = False
|
|
||||||
for msg in messages:
|
|
||||||
if isinstance(msg, ToolMessage):
|
|
||||||
patched = _patch_tool_message(msg, config, outputs_path=None)
|
|
||||||
if patched is not msg:
|
|
||||||
changed = True
|
|
||||||
updated.append(patched)
|
|
||||||
else:
|
|
||||||
updated.append(msg)
|
|
||||||
return updated if changed else None
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# Middleware class
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
|
|
||||||
class ToolOutputBudgetMiddleware(AgentMiddleware[AgentState]):
|
|
||||||
"""Enforce per-result budget on tool outputs via externalization or truncation."""
|
|
||||||
|
|
||||||
def __init__(self, config: ToolOutputConfig | None = None) -> None:
|
|
||||||
super().__init__()
|
|
||||||
self._config = config if config is not None else _default_config()
|
|
||||||
|
|
||||||
@classmethod
|
|
||||||
def from_app_config(cls, app_config: Any) -> ToolOutputBudgetMiddleware:
|
|
||||||
tool_output = getattr(app_config, "tool_output", None)
|
|
||||||
if isinstance(tool_output, ToolOutputConfig):
|
|
||||||
return cls(config=tool_output)
|
|
||||||
return cls()
|
|
||||||
|
|
||||||
# -- tool call hooks ---------------------------------------------------
|
|
||||||
|
|
||||||
@override
|
|
||||||
def wrap_tool_call(
|
|
||||||
self,
|
|
||||||
request: ToolCallRequest,
|
|
||||||
handler: Callable[[ToolCallRequest], ToolMessage | Command],
|
|
||||||
) -> ToolMessage | Command:
|
|
||||||
result = handler(request)
|
|
||||||
if not self._config.enabled:
|
|
||||||
return result
|
|
||||||
if not _needs_budget(result, self._config):
|
|
||||||
return result
|
|
||||||
outputs_path = _resolve_outputs_path(request)
|
|
||||||
return _patch_result(result, self._config, outputs_path)
|
|
||||||
|
|
||||||
@override
|
|
||||||
async def awrap_tool_call(
|
|
||||||
self,
|
|
||||||
request: ToolCallRequest,
|
|
||||||
handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]],
|
|
||||||
) -> ToolMessage | Command:
|
|
||||||
result = await handler(request)
|
|
||||||
if not self._config.enabled:
|
|
||||||
return result
|
|
||||||
if not _needs_budget(result, self._config):
|
|
||||||
return result
|
|
||||||
outputs_path = _resolve_outputs_path(request)
|
|
||||||
return await asyncio.to_thread(_patch_result, result, self._config, outputs_path)
|
|
||||||
|
|
||||||
# -- model call hooks (historical message truncation) ------------------
|
|
||||||
|
|
||||||
@override
|
|
||||||
def wrap_model_call(
|
|
||||||
self,
|
|
||||||
request: ModelRequest,
|
|
||||||
handler: Callable[[ModelRequest], ModelResponse],
|
|
||||||
) -> ModelCallResult:
|
|
||||||
if self._config.enabled:
|
|
||||||
messages = getattr(request, "messages", None)
|
|
||||||
if isinstance(messages, list):
|
|
||||||
patched = _patch_model_messages(messages, self._config)
|
|
||||||
if patched is not None:
|
|
||||||
request = request.override(messages=patched)
|
|
||||||
return handler(request)
|
|
||||||
|
|
||||||
@override
|
|
||||||
async def awrap_model_call(
|
|
||||||
self,
|
|
||||||
request: ModelRequest,
|
|
||||||
handler: Callable[[ModelRequest], Awaitable[ModelResponse]],
|
|
||||||
) -> ModelCallResult:
|
|
||||||
if self._config.enabled:
|
|
||||||
messages = getattr(request, "messages", None)
|
|
||||||
if isinstance(messages, list):
|
|
||||||
patched = _patch_model_messages(messages, self._config)
|
|
||||||
if patched is not None:
|
|
||||||
request = request.override(messages=patched)
|
|
||||||
return await handler(request)
|
|
||||||
@@ -7,7 +7,6 @@ from typing import NotRequired, override
|
|||||||
from langchain.agents import AgentState
|
from langchain.agents import AgentState
|
||||||
from langchain.agents.middleware import AgentMiddleware
|
from langchain.agents.middleware import AgentMiddleware
|
||||||
from langchain_core.messages import HumanMessage
|
from langchain_core.messages import HumanMessage
|
||||||
from langchain_core.runnables import run_in_executor
|
|
||||||
from langgraph.runtime import Runtime
|
from langgraph.runtime import Runtime
|
||||||
|
|
||||||
from deerflow.config.paths import Paths, get_paths
|
from deerflow.config.paths import Paths, get_paths
|
||||||
@@ -294,16 +293,3 @@ class UploadsMiddleware(AgentMiddleware[UploadsMiddlewareState]):
|
|||||||
"uploaded_files": new_files,
|
"uploaded_files": new_files,
|
||||||
"messages": messages,
|
"messages": messages,
|
||||||
}
|
}
|
||||||
|
|
||||||
@override
|
|
||||||
async def abefore_agent(self, state: UploadsMiddlewareState, runtime: Runtime) -> dict | None:
|
|
||||||
"""Async hook that offloads the synchronous uploads scan off the event loop.
|
|
||||||
|
|
||||||
``before_agent`` performs blocking filesystem IO (directory enumeration,
|
|
||||||
``stat``, reading sibling ``.md`` outlines). When the graph runs async,
|
|
||||||
langgraph would otherwise execute the sync hook directly on the event
|
|
||||||
loop, so it is dispatched to a worker thread via ``run_in_executor``.
|
|
||||||
``run_in_executor`` copies the current context, so the ``user_id``
|
|
||||||
contextvar read by ``get_effective_user_id()`` is preserved.
|
|
||||||
"""
|
|
||||||
return await run_in_executor(None, self.before_agent, state, runtime)
|
|
||||||
|
|||||||
@@ -45,24 +45,11 @@ def merge_viewed_images(existing: dict[str, ViewedImageData] | None, new: dict[s
|
|||||||
return {**existing, **new}
|
return {**existing, **new}
|
||||||
|
|
||||||
|
|
||||||
def merge_todos(existing: list | None, new: list | None) -> list | None:
|
|
||||||
"""Reducer for todos list - keeps the last non-None value.
|
|
||||||
|
|
||||||
Semantics:
|
|
||||||
- If `new` is None (node didn't touch todos), preserve `existing`.
|
|
||||||
- If `new` is provided (even empty list), it represents an explicit
|
|
||||||
update and wins over `existing`.
|
|
||||||
"""
|
|
||||||
if new is None:
|
|
||||||
return existing
|
|
||||||
return new
|
|
||||||
|
|
||||||
|
|
||||||
class ThreadState(AgentState):
|
class ThreadState(AgentState):
|
||||||
sandbox: NotRequired[SandboxState | None]
|
sandbox: NotRequired[SandboxState | None]
|
||||||
thread_data: NotRequired[ThreadDataState | None]
|
thread_data: NotRequired[ThreadDataState | None]
|
||||||
title: NotRequired[str | None]
|
title: NotRequired[str | None]
|
||||||
artifacts: Annotated[list[str], merge_artifacts]
|
artifacts: Annotated[list[str], merge_artifacts]
|
||||||
todos: Annotated[list | None, merge_todos]
|
todos: NotRequired[list | None]
|
||||||
uploaded_files: NotRequired[list[dict] | None]
|
uploaded_files: NotRequired[list[dict] | None]
|
||||||
viewed_images: Annotated[dict[str, ViewedImageData], merge_viewed_images] # image_path -> {base64, mime_type}
|
viewed_images: Annotated[dict[str, ViewedImageData], merge_viewed_images] # image_path -> {base64, mime_type}
|
||||||
|
|||||||
@@ -19,7 +19,6 @@ import asyncio
|
|||||||
import json
|
import json
|
||||||
import logging
|
import logging
|
||||||
import mimetypes
|
import mimetypes
|
||||||
import os
|
|
||||||
import shutil
|
import shutil
|
||||||
import tempfile
|
import tempfile
|
||||||
import uuid
|
import uuid
|
||||||
@@ -43,7 +42,6 @@ from deerflow.config.paths import get_paths
|
|||||||
from deerflow.models import create_chat_model
|
from deerflow.models import create_chat_model
|
||||||
from deerflow.runtime.user_context import get_effective_user_id
|
from deerflow.runtime.user_context import get_effective_user_id
|
||||||
from deerflow.skills.storage import get_or_new_skill_storage
|
from deerflow.skills.storage import get_or_new_skill_storage
|
||||||
from deerflow.tracing import build_tracing_callbacks, inject_langfuse_metadata
|
|
||||||
from deerflow.uploads.manager import (
|
from deerflow.uploads.manager import (
|
||||||
claim_unique_filename,
|
claim_unique_filename,
|
||||||
delete_file_safe,
|
delete_file_safe,
|
||||||
@@ -125,7 +123,6 @@ class DeerFlowClient:
|
|||||||
agent_name: str | None = None,
|
agent_name: str | None = None,
|
||||||
available_skills: set[str] | None = None,
|
available_skills: set[str] | None = None,
|
||||||
middlewares: Sequence[AgentMiddleware] | None = None,
|
middlewares: Sequence[AgentMiddleware] | None = None,
|
||||||
environment: str | None = None,
|
|
||||||
):
|
):
|
||||||
"""Initialize the client.
|
"""Initialize the client.
|
||||||
|
|
||||||
@@ -143,12 +140,6 @@ class DeerFlowClient:
|
|||||||
agent_name: Name of the agent to use.
|
agent_name: Name of the agent to use.
|
||||||
available_skills: Optional set of skill names to make available. If None (default), all scanned skills are available.
|
available_skills: Optional set of skill names to make available. If None (default), all scanned skills are available.
|
||||||
middlewares: Optional list of custom middlewares to inject into the agent.
|
middlewares: Optional list of custom middlewares to inject into the agent.
|
||||||
environment: Deployment environment label that ends up in
|
|
||||||
``langfuse_tags`` (e.g. ``"production"`` / ``"staging"``).
|
|
||||||
When ``None`` the worker/client falls back to the
|
|
||||||
``DEER_FLOW_ENV`` or ``ENVIRONMENT`` env vars. Pass an
|
|
||||||
explicit value for programmatic callers that do not want
|
|
||||||
env-var coupling.
|
|
||||||
"""
|
"""
|
||||||
if config_path is not None:
|
if config_path is not None:
|
||||||
reload_app_config(config_path)
|
reload_app_config(config_path)
|
||||||
@@ -165,7 +156,6 @@ class DeerFlowClient:
|
|||||||
self._agent_name = agent_name
|
self._agent_name = agent_name
|
||||||
self._available_skills = set(available_skills) if available_skills is not None else None
|
self._available_skills = set(available_skills) if available_skills is not None else None
|
||||||
self._middlewares = list(middlewares) if middlewares else []
|
self._middlewares = list(middlewares) if middlewares else []
|
||||||
self._environment = environment
|
|
||||||
|
|
||||||
# Lazy agent — created on first call, recreated when config changes.
|
# Lazy agent — created on first call, recreated when config changes.
|
||||||
self._agent = None
|
self._agent = None
|
||||||
@@ -238,11 +228,7 @@ class DeerFlowClient:
|
|||||||
max_concurrent_subagents = cfg.get("max_concurrent_subagents", 3)
|
max_concurrent_subagents = cfg.get("max_concurrent_subagents", 3)
|
||||||
|
|
||||||
kwargs: dict[str, Any] = {
|
kwargs: dict[str, Any] = {
|
||||||
# attach_tracing=False because ``stream()`` injects tracing
|
"model": create_chat_model(name=model_name, thinking_enabled=thinking_enabled),
|
||||||
# callbacks at the graph invocation root so a single embedded run
|
|
||||||
# produces one trace with correct session_id / user_id propagation.
|
|
||||||
# Attaching them again on the model would emit duplicate spans.
|
|
||||||
"model": create_chat_model(name=model_name, thinking_enabled=thinking_enabled, attach_tracing=False),
|
|
||||||
"tools": self._get_tools(model_name=model_name, subagent_enabled=subagent_enabled),
|
"tools": self._get_tools(model_name=model_name, subagent_enabled=subagent_enabled),
|
||||||
"middleware": _build_middlewares(config, model_name=model_name, agent_name=self._agent_name, custom_middlewares=self._middlewares),
|
"middleware": _build_middlewares(config, model_name=model_name, agent_name=self._agent_name, custom_middlewares=self._middlewares),
|
||||||
"system_prompt": apply_prompt_template(
|
"system_prompt": apply_prompt_template(
|
||||||
@@ -585,28 +571,6 @@ class DeerFlowClient:
|
|||||||
thread_id = str(uuid.uuid4())
|
thread_id = str(uuid.uuid4())
|
||||||
|
|
||||||
config = self._get_runnable_config(thread_id, **kwargs)
|
config = self._get_runnable_config(thread_id, **kwargs)
|
||||||
|
|
||||||
# Inject tracing callbacks and Langfuse trace metadata at the graph
|
|
||||||
# invocation root so the embedded client matches the gateway worker's
|
|
||||||
# behaviour: a single ``stream()`` produces one trace with all node /
|
|
||||||
# LLM / tool calls nested under it, and the trace carries the reserved
|
|
||||||
# ``langfuse_session_id`` / ``langfuse_user_id`` keys that the Langfuse
|
|
||||||
# CallbackHandler lifts onto the root trace's ``sessionId`` / ``userId``.
|
|
||||||
tracing_callbacks = build_tracing_callbacks()
|
|
||||||
if tracing_callbacks:
|
|
||||||
existing_callbacks = list(config.get("callbacks") or [])
|
|
||||||
config["callbacks"] = [*existing_callbacks, *tracing_callbacks]
|
|
||||||
|
|
||||||
configurable = config.get("configurable") or {}
|
|
||||||
inject_langfuse_metadata(
|
|
||||||
config,
|
|
||||||
thread_id=thread_id,
|
|
||||||
user_id=get_effective_user_id(),
|
|
||||||
assistant_id=self._agent_name or "lead-agent",
|
|
||||||
model_name=configurable.get("model_name") or self._model_name,
|
|
||||||
environment=self._environment or os.environ.get("DEER_FLOW_ENV") or os.environ.get("ENVIRONMENT"),
|
|
||||||
)
|
|
||||||
|
|
||||||
self._ensure_agent(config)
|
self._ensure_agent(config)
|
||||||
|
|
||||||
state: dict[str, Any] = {"messages": [HumanMessage(content=message)]}
|
state: dict[str, Any] = {"messages": [HumanMessage(content=message)]}
|
||||||
|
|||||||
@@ -1,5 +1,4 @@
|
|||||||
import base64
|
import base64
|
||||||
import errno
|
|
||||||
import logging
|
import logging
|
||||||
import shlex
|
import shlex
|
||||||
import threading
|
import threading
|
||||||
@@ -7,14 +6,11 @@ import uuid
|
|||||||
|
|
||||||
from agent_sandbox import Sandbox as AioSandboxClient
|
from agent_sandbox import Sandbox as AioSandboxClient
|
||||||
|
|
||||||
from deerflow.config.paths import VIRTUAL_PATH_PREFIX
|
|
||||||
from deerflow.sandbox.sandbox import Sandbox
|
from deerflow.sandbox.sandbox import Sandbox
|
||||||
from deerflow.sandbox.search import GrepMatch, path_matches, should_ignore_path, truncate_line
|
from deerflow.sandbox.search import GrepMatch, path_matches, should_ignore_path, truncate_line
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
_MAX_DOWNLOAD_SIZE = 100 * 1024 * 1024 # 100 MB
|
|
||||||
|
|
||||||
_ERROR_OBSERVATION_SIGNATURE = "'ErrorObservation' object has no attribute 'exit_code'"
|
_ERROR_OBSERVATION_SIGNATURE = "'ErrorObservation' object has no attribute 'exit_code'"
|
||||||
|
|
||||||
|
|
||||||
@@ -106,49 +102,6 @@ class AioSandbox(Sandbox):
|
|||||||
logger.error(f"Failed to read file in sandbox: {e}")
|
logger.error(f"Failed to read file in sandbox: {e}")
|
||||||
return f"Error: {e}"
|
return f"Error: {e}"
|
||||||
|
|
||||||
def download_file(self, path: str) -> bytes:
|
|
||||||
"""Download file bytes from the sandbox.
|
|
||||||
|
|
||||||
Raises:
|
|
||||||
PermissionError: If the path contains '..' traversal segments or is
|
|
||||||
outside ``VIRTUAL_PATH_PREFIX``.
|
|
||||||
OSError: If the file cannot be retrieved from the sandbox.
|
|
||||||
"""
|
|
||||||
# Reject path traversal before sending to the container API.
|
|
||||||
# LocalSandbox gets this implicitly via _resolve_path;
|
|
||||||
# here the path is forwarded verbatim so we must check explicitly.
|
|
||||||
normalised = path.replace("\\", "/")
|
|
||||||
for segment in normalised.split("/"):
|
|
||||||
if segment == "..":
|
|
||||||
logger.error(f"Refused download due to path traversal: {path}")
|
|
||||||
raise PermissionError(f"Access denied: path traversal detected in '{path}'")
|
|
||||||
|
|
||||||
stripped_path = normalised.lstrip("/")
|
|
||||||
allowed_prefix = VIRTUAL_PATH_PREFIX.lstrip("/")
|
|
||||||
if stripped_path != allowed_prefix and not stripped_path.startswith(f"{allowed_prefix}/"):
|
|
||||||
logger.error("Refused download outside allowed directory: path=%s, allowed_prefix=%s", path, VIRTUAL_PATH_PREFIX)
|
|
||||||
raise PermissionError(f"Access denied: path must be under '{VIRTUAL_PATH_PREFIX}': '{path}'")
|
|
||||||
|
|
||||||
with self._lock:
|
|
||||||
try:
|
|
||||||
chunks: list[bytes] = []
|
|
||||||
total = 0
|
|
||||||
for chunk in self._client.file.download_file(path=path):
|
|
||||||
total += len(chunk)
|
|
||||||
if total > _MAX_DOWNLOAD_SIZE:
|
|
||||||
raise OSError(
|
|
||||||
errno.EFBIG,
|
|
||||||
f"File exceeds maximum download size of {_MAX_DOWNLOAD_SIZE} bytes",
|
|
||||||
path,
|
|
||||||
)
|
|
||||||
chunks.append(chunk)
|
|
||||||
return b"".join(chunks)
|
|
||||||
except OSError:
|
|
||||||
raise
|
|
||||||
except Exception as e:
|
|
||||||
logger.error(f"Failed to download file in sandbox: {e}")
|
|
||||||
raise OSError(f"Failed to download file '{path}' from sandbox: {e}") from e
|
|
||||||
|
|
||||||
def list_dir(self, path: str, max_depth: int = 2) -> list[str]:
|
def list_dir(self, path: str, max_depth: int = 2) -> list[str]:
|
||||||
"""List the contents of a directory in the sandbox.
|
"""List the contents of a directory in the sandbox.
|
||||||
|
|
||||||
|
|||||||
@@ -10,7 +10,6 @@ The provider itself handles:
|
|||||||
- Mount computation (thread-specific, skills)
|
- Mount computation (thread-specific, skills)
|
||||||
"""
|
"""
|
||||||
|
|
||||||
import asyncio
|
|
||||||
import atexit
|
import atexit
|
||||||
import hashlib
|
import hashlib
|
||||||
import logging
|
import logging
|
||||||
@@ -19,7 +18,6 @@ import signal
|
|||||||
import threading
|
import threading
|
||||||
import time
|
import time
|
||||||
import uuid
|
import uuid
|
||||||
from concurrent.futures import ThreadPoolExecutor
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
import fcntl
|
import fcntl
|
||||||
@@ -34,7 +32,7 @@ from deerflow.sandbox.sandbox import Sandbox
|
|||||||
from deerflow.sandbox.sandbox_provider import SandboxProvider
|
from deerflow.sandbox.sandbox_provider import SandboxProvider
|
||||||
|
|
||||||
from .aio_sandbox import AioSandbox
|
from .aio_sandbox import AioSandbox
|
||||||
from .backend import SandboxBackend, wait_for_sandbox_ready, wait_for_sandbox_ready_async
|
from .backend import SandboxBackend, wait_for_sandbox_ready
|
||||||
from .local_backend import LocalContainerBackend
|
from .local_backend import LocalContainerBackend
|
||||||
from .remote_backend import RemoteSandboxBackend
|
from .remote_backend import RemoteSandboxBackend
|
||||||
from .sandbox_info import SandboxInfo
|
from .sandbox_info import SandboxInfo
|
||||||
@@ -48,9 +46,6 @@ DEFAULT_CONTAINER_PREFIX = "deer-flow-sandbox"
|
|||||||
DEFAULT_IDLE_TIMEOUT = 600 # 10 minutes in seconds
|
DEFAULT_IDLE_TIMEOUT = 600 # 10 minutes in seconds
|
||||||
DEFAULT_REPLICAS = 3 # Maximum concurrent sandbox containers
|
DEFAULT_REPLICAS = 3 # Maximum concurrent sandbox containers
|
||||||
IDLE_CHECK_INTERVAL = 60 # Check every 60 seconds
|
IDLE_CHECK_INTERVAL = 60 # Check every 60 seconds
|
||||||
THREAD_LOCK_EXECUTOR_WORKERS = min(32, (os.cpu_count() or 1) + 4)
|
|
||||||
_THREAD_LOCK_EXECUTOR = ThreadPoolExecutor(max_workers=THREAD_LOCK_EXECUTOR_WORKERS, thread_name_prefix="sandbox-lock-wait")
|
|
||||||
atexit.register(_THREAD_LOCK_EXECUTOR.shutdown, wait=False, cancel_futures=True)
|
|
||||||
|
|
||||||
|
|
||||||
def _lock_file_exclusive(lock_file) -> None:
|
def _lock_file_exclusive(lock_file) -> None:
|
||||||
@@ -71,40 +66,6 @@ def _unlock_file(lock_file) -> None:
|
|||||||
msvcrt.locking(lock_file.fileno(), msvcrt.LK_UNLCK, 1)
|
msvcrt.locking(lock_file.fileno(), msvcrt.LK_UNLCK, 1)
|
||||||
|
|
||||||
|
|
||||||
def _open_lock_file(lock_path):
|
|
||||||
return open(lock_path, "a", encoding="utf-8")
|
|
||||||
|
|
||||||
|
|
||||||
async def _acquire_thread_lock_async(lock: threading.Lock) -> None:
|
|
||||||
"""Acquire a threading.Lock without polling or using the default executor."""
|
|
||||||
loop = asyncio.get_running_loop()
|
|
||||||
acquire_future = loop.run_in_executor(_THREAD_LOCK_EXECUTOR, lock.acquire, True)
|
|
||||||
|
|
||||||
try:
|
|
||||||
acquired = await asyncio.shield(acquire_future)
|
|
||||||
except asyncio.CancelledError:
|
|
||||||
acquire_future.add_done_callback(lambda task: _release_cancelled_lock_acquire(lock, task))
|
|
||||||
raise
|
|
||||||
|
|
||||||
if not acquired:
|
|
||||||
raise RuntimeError("Failed to acquire sandbox thread lock")
|
|
||||||
|
|
||||||
|
|
||||||
def _release_cancelled_lock_acquire(lock: threading.Lock, task: asyncio.Future[bool]) -> None:
|
|
||||||
"""Release a lock acquired after its awaiting coroutine was cancelled."""
|
|
||||||
if task.cancelled():
|
|
||||||
return
|
|
||||||
|
|
||||||
try:
|
|
||||||
acquired = task.result()
|
|
||||||
except Exception as e:
|
|
||||||
logger.warning(f"Cancelled sandbox lock acquisition finished with error: {e}")
|
|
||||||
return
|
|
||||||
|
|
||||||
if acquired:
|
|
||||||
lock.release()
|
|
||||||
|
|
||||||
|
|
||||||
class AioSandboxProvider(SandboxProvider):
|
class AioSandboxProvider(SandboxProvider):
|
||||||
"""Sandbox provider that manages containers running the AIO sandbox.
|
"""Sandbox provider that manages containers running the AIO sandbox.
|
||||||
|
|
||||||
@@ -455,96 +416,6 @@ class AioSandboxProvider(SandboxProvider):
|
|||||||
self._thread_locks[thread_id] = threading.Lock()
|
self._thread_locks[thread_id] = threading.Lock()
|
||||||
return self._thread_locks[thread_id]
|
return self._thread_locks[thread_id]
|
||||||
|
|
||||||
def _sandbox_id_for_thread(self, thread_id: str | None) -> str:
|
|
||||||
"""Return deterministic IDs for thread sandboxes and random IDs otherwise."""
|
|
||||||
return self._deterministic_sandbox_id(thread_id) if thread_id else str(uuid.uuid4())[:8]
|
|
||||||
|
|
||||||
def _reuse_in_process_sandbox(self, thread_id: str | None, *, post_lock: bool = False) -> str | None:
|
|
||||||
"""Reuse an active in-process sandbox for a thread if one is still tracked."""
|
|
||||||
if thread_id is None:
|
|
||||||
return None
|
|
||||||
|
|
||||||
with self._lock:
|
|
||||||
if thread_id not in self._thread_sandboxes:
|
|
||||||
return None
|
|
||||||
|
|
||||||
existing_id = self._thread_sandboxes[thread_id]
|
|
||||||
if existing_id in self._sandboxes:
|
|
||||||
suffix = " (post-lock check)" if post_lock else ""
|
|
||||||
logger.info(f"Reusing in-process sandbox {existing_id} for thread {thread_id}{suffix}")
|
|
||||||
self._last_activity[existing_id] = time.time()
|
|
||||||
return existing_id
|
|
||||||
|
|
||||||
del self._thread_sandboxes[thread_id]
|
|
||||||
return None
|
|
||||||
|
|
||||||
def _reclaim_warm_pool_sandbox(self, thread_id: str | None, sandbox_id: str, *, post_lock: bool = False) -> str | None:
|
|
||||||
"""Promote a warm-pool sandbox back to active tracking if available."""
|
|
||||||
if thread_id is None:
|
|
||||||
return None
|
|
||||||
|
|
||||||
with self._lock:
|
|
||||||
if sandbox_id not in self._warm_pool:
|
|
||||||
return None
|
|
||||||
|
|
||||||
info, _ = self._warm_pool.pop(sandbox_id)
|
|
||||||
sandbox = AioSandbox(id=sandbox_id, base_url=info.sandbox_url)
|
|
||||||
self._sandboxes[sandbox_id] = sandbox
|
|
||||||
self._sandbox_infos[sandbox_id] = info
|
|
||||||
self._last_activity[sandbox_id] = time.time()
|
|
||||||
self._thread_sandboxes[thread_id] = sandbox_id
|
|
||||||
|
|
||||||
suffix = " (post-lock check)" if post_lock else f" at {info.sandbox_url}"
|
|
||||||
logger.info(f"Reclaimed warm-pool sandbox {sandbox_id} for thread {thread_id}{suffix}")
|
|
||||||
return sandbox_id
|
|
||||||
|
|
||||||
def _recheck_cached_sandbox(self, thread_id: str, sandbox_id: str) -> str | None:
|
|
||||||
"""Re-check in-memory caches after acquiring the cross-process file lock."""
|
|
||||||
return self._reuse_in_process_sandbox(thread_id, post_lock=True) or self._reclaim_warm_pool_sandbox(thread_id, sandbox_id, post_lock=True)
|
|
||||||
|
|
||||||
def _register_discovered_sandbox(self, thread_id: str, info: SandboxInfo) -> str:
|
|
||||||
"""Track a sandbox discovered through the backend."""
|
|
||||||
sandbox = AioSandbox(id=info.sandbox_id, base_url=info.sandbox_url)
|
|
||||||
with self._lock:
|
|
||||||
self._sandboxes[info.sandbox_id] = sandbox
|
|
||||||
self._sandbox_infos[info.sandbox_id] = info
|
|
||||||
self._last_activity[info.sandbox_id] = time.time()
|
|
||||||
self._thread_sandboxes[thread_id] = info.sandbox_id
|
|
||||||
|
|
||||||
logger.info(f"Discovered existing sandbox {info.sandbox_id} for thread {thread_id} at {info.sandbox_url}")
|
|
||||||
return info.sandbox_id
|
|
||||||
|
|
||||||
def _register_created_sandbox(self, thread_id: str | None, sandbox_id: str, info: SandboxInfo) -> str:
|
|
||||||
"""Track a newly-created sandbox in the active maps."""
|
|
||||||
sandbox = AioSandbox(id=sandbox_id, base_url=info.sandbox_url)
|
|
||||||
with self._lock:
|
|
||||||
self._sandboxes[sandbox_id] = sandbox
|
|
||||||
self._sandbox_infos[sandbox_id] = info
|
|
||||||
self._last_activity[sandbox_id] = time.time()
|
|
||||||
if thread_id:
|
|
||||||
self._thread_sandboxes[thread_id] = sandbox_id
|
|
||||||
|
|
||||||
logger.info(f"Created sandbox {sandbox_id} for thread {thread_id} at {info.sandbox_url}")
|
|
||||||
return sandbox_id
|
|
||||||
|
|
||||||
def _replica_count(self) -> tuple[int, int]:
|
|
||||||
"""Return configured replicas and currently tracked sandbox count."""
|
|
||||||
replicas = self._config.get("replicas", DEFAULT_REPLICAS)
|
|
||||||
with self._lock:
|
|
||||||
total = len(self._sandboxes) + len(self._warm_pool)
|
|
||||||
return replicas, total
|
|
||||||
|
|
||||||
def _log_replicas_soft_cap(self, replicas: int, sandbox_id: str, evicted: str | None) -> None:
|
|
||||||
"""Log the result of enforcing the warm-pool replica budget."""
|
|
||||||
if evicted:
|
|
||||||
logger.info(f"Evicted warm-pool sandbox {evicted} to stay within replicas={replicas}")
|
|
||||||
return
|
|
||||||
|
|
||||||
# All slots are occupied by active sandboxes — proceed anyway and log.
|
|
||||||
# The replicas limit is a soft cap; we never forcibly stop a container
|
|
||||||
# that is actively serving a thread.
|
|
||||||
logger.warning(f"All {replicas} replica slots are in active use; creating sandbox {sandbox_id} beyond the soft limit")
|
|
||||||
|
|
||||||
# ── Core: acquire / get / release / shutdown ─────────────────────────
|
# ── Core: acquire / get / release / shutdown ─────────────────────────
|
||||||
|
|
||||||
def acquire(self, thread_id: str | None = None) -> str:
|
def acquire(self, thread_id: str | None = None) -> str:
|
||||||
@@ -569,23 +440,6 @@ class AioSandboxProvider(SandboxProvider):
|
|||||||
else:
|
else:
|
||||||
return self._acquire_internal(thread_id)
|
return self._acquire_internal(thread_id)
|
||||||
|
|
||||||
async def acquire_async(self, thread_id: str | None = None) -> str:
|
|
||||||
"""Acquire a sandbox environment without blocking the event loop.
|
|
||||||
|
|
||||||
Mirrors ``acquire()`` while keeping blocking backend operations off the
|
|
||||||
event loop and using async-native readiness polling for newly created
|
|
||||||
sandboxes.
|
|
||||||
"""
|
|
||||||
if thread_id:
|
|
||||||
thread_lock = self._get_thread_lock(thread_id)
|
|
||||||
await _acquire_thread_lock_async(thread_lock)
|
|
||||||
try:
|
|
||||||
return await self._acquire_internal_async(thread_id)
|
|
||||||
finally:
|
|
||||||
thread_lock.release()
|
|
||||||
|
|
||||||
return await self._acquire_internal_async(thread_id)
|
|
||||||
|
|
||||||
def _acquire_internal(self, thread_id: str | None) -> str:
|
def _acquire_internal(self, thread_id: str | None) -> str:
|
||||||
"""Internal sandbox acquisition with two-layer consistency.
|
"""Internal sandbox acquisition with two-layer consistency.
|
||||||
|
|
||||||
@@ -594,17 +448,33 @@ class AioSandboxProvider(SandboxProvider):
|
|||||||
sandbox_id is deterministic from thread_id so no shared state file
|
sandbox_id is deterministic from thread_id so no shared state file
|
||||||
is needed — any process can derive the same container name)
|
is needed — any process can derive the same container name)
|
||||||
"""
|
"""
|
||||||
cached_id = self._reuse_in_process_sandbox(thread_id)
|
# ── Layer 1: In-process cache (fast path) ──
|
||||||
if cached_id is not None:
|
if thread_id:
|
||||||
return cached_id
|
with self._lock:
|
||||||
|
if thread_id in self._thread_sandboxes:
|
||||||
|
existing_id = self._thread_sandboxes[thread_id]
|
||||||
|
if existing_id in self._sandboxes:
|
||||||
|
logger.info(f"Reusing in-process sandbox {existing_id} for thread {thread_id}")
|
||||||
|
self._last_activity[existing_id] = time.time()
|
||||||
|
return existing_id
|
||||||
|
else:
|
||||||
|
del self._thread_sandboxes[thread_id]
|
||||||
|
|
||||||
# Deterministic ID for thread-specific, random for anonymous
|
# Deterministic ID for thread-specific, random for anonymous
|
||||||
sandbox_id = self._sandbox_id_for_thread(thread_id)
|
sandbox_id = self._deterministic_sandbox_id(thread_id) if thread_id else str(uuid.uuid4())[:8]
|
||||||
|
|
||||||
# ── Layer 1.5: Warm pool (container still running, no cold-start) ──
|
# ── Layer 1.5: Warm pool (container still running, no cold-start) ──
|
||||||
reclaimed_id = self._reclaim_warm_pool_sandbox(thread_id, sandbox_id)
|
if thread_id:
|
||||||
if reclaimed_id is not None:
|
with self._lock:
|
||||||
return reclaimed_id
|
if sandbox_id in self._warm_pool:
|
||||||
|
info, _ = self._warm_pool.pop(sandbox_id)
|
||||||
|
sandbox = AioSandbox(id=sandbox_id, base_url=info.sandbox_url)
|
||||||
|
self._sandboxes[sandbox_id] = sandbox
|
||||||
|
self._sandbox_infos[sandbox_id] = info
|
||||||
|
self._last_activity[sandbox_id] = time.time()
|
||||||
|
self._thread_sandboxes[thread_id] = sandbox_id
|
||||||
|
logger.info(f"Reclaimed warm-pool sandbox {sandbox_id} for thread {thread_id} at {info.sandbox_url}")
|
||||||
|
return sandbox_id
|
||||||
|
|
||||||
# ── Layer 2: Backend discovery + create (protected by cross-process lock) ──
|
# ── Layer 2: Backend discovery + create (protected by cross-process lock) ──
|
||||||
# Use a file lock so that two processes racing to create the same sandbox
|
# Use a file lock so that two processes racing to create the same sandbox
|
||||||
@@ -615,26 +485,6 @@ class AioSandboxProvider(SandboxProvider):
|
|||||||
|
|
||||||
return self._create_sandbox(thread_id, sandbox_id)
|
return self._create_sandbox(thread_id, sandbox_id)
|
||||||
|
|
||||||
async def _acquire_internal_async(self, thread_id: str | None) -> str:
|
|
||||||
"""Async counterpart to ``_acquire_internal``."""
|
|
||||||
cached_id = self._reuse_in_process_sandbox(thread_id)
|
|
||||||
if cached_id is not None:
|
|
||||||
return cached_id
|
|
||||||
|
|
||||||
# Deterministic ID for thread-specific, random for anonymous
|
|
||||||
sandbox_id = self._sandbox_id_for_thread(thread_id)
|
|
||||||
|
|
||||||
# ── Layer 1.5: Warm pool (container still running, no cold-start) ──
|
|
||||||
reclaimed_id = self._reclaim_warm_pool_sandbox(thread_id, sandbox_id)
|
|
||||||
if reclaimed_id is not None:
|
|
||||||
return reclaimed_id
|
|
||||||
|
|
||||||
# ── Layer 2: Backend discovery + create (protected by cross-process lock) ──
|
|
||||||
if thread_id:
|
|
||||||
return await self._discover_or_create_with_lock_async(thread_id, sandbox_id)
|
|
||||||
|
|
||||||
return await self._create_sandbox_async(thread_id, sandbox_id)
|
|
||||||
|
|
||||||
def _discover_or_create_with_lock(self, thread_id: str, sandbox_id: str) -> str:
|
def _discover_or_create_with_lock(self, thread_id: str, sandbox_id: str) -> str:
|
||||||
"""Discover an existing sandbox or create a new one under a cross-process file lock.
|
"""Discover an existing sandbox or create a new one under a cross-process file lock.
|
||||||
|
|
||||||
@@ -653,50 +503,40 @@ class AioSandboxProvider(SandboxProvider):
|
|||||||
locked = True
|
locked = True
|
||||||
# Re-check in-process caches under the file lock in case another
|
# Re-check in-process caches under the file lock in case another
|
||||||
# thread in this process won the race while we were waiting.
|
# thread in this process won the race while we were waiting.
|
||||||
cached_id = self._recheck_cached_sandbox(thread_id, sandbox_id)
|
with self._lock:
|
||||||
if cached_id is not None:
|
if thread_id in self._thread_sandboxes:
|
||||||
return cached_id
|
existing_id = self._thread_sandboxes[thread_id]
|
||||||
|
if existing_id in self._sandboxes:
|
||||||
|
logger.info(f"Reusing in-process sandbox {existing_id} for thread {thread_id} (post-lock check)")
|
||||||
|
self._last_activity[existing_id] = time.time()
|
||||||
|
return existing_id
|
||||||
|
if sandbox_id in self._warm_pool:
|
||||||
|
info, _ = self._warm_pool.pop(sandbox_id)
|
||||||
|
sandbox = AioSandbox(id=sandbox_id, base_url=info.sandbox_url)
|
||||||
|
self._sandboxes[sandbox_id] = sandbox
|
||||||
|
self._sandbox_infos[sandbox_id] = info
|
||||||
|
self._last_activity[sandbox_id] = time.time()
|
||||||
|
self._thread_sandboxes[thread_id] = sandbox_id
|
||||||
|
logger.info(f"Reclaimed warm-pool sandbox {sandbox_id} for thread {thread_id} (post-lock check)")
|
||||||
|
return sandbox_id
|
||||||
|
|
||||||
# Backend discovery: another process may have created the container.
|
# Backend discovery: another process may have created the container.
|
||||||
discovered = self._backend.discover(sandbox_id)
|
discovered = self._backend.discover(sandbox_id)
|
||||||
if discovered is not None:
|
if discovered is not None:
|
||||||
return self._register_discovered_sandbox(thread_id, discovered)
|
sandbox = AioSandbox(id=discovered.sandbox_id, base_url=discovered.sandbox_url)
|
||||||
|
with self._lock:
|
||||||
|
self._sandboxes[discovered.sandbox_id] = sandbox
|
||||||
|
self._sandbox_infos[discovered.sandbox_id] = discovered
|
||||||
|
self._last_activity[discovered.sandbox_id] = time.time()
|
||||||
|
self._thread_sandboxes[thread_id] = discovered.sandbox_id
|
||||||
|
logger.info(f"Discovered existing sandbox {discovered.sandbox_id} for thread {thread_id} at {discovered.sandbox_url}")
|
||||||
|
return discovered.sandbox_id
|
||||||
|
|
||||||
return self._create_sandbox(thread_id, sandbox_id)
|
return self._create_sandbox(thread_id, sandbox_id)
|
||||||
finally:
|
finally:
|
||||||
if locked:
|
if locked:
|
||||||
_unlock_file(lock_file)
|
_unlock_file(lock_file)
|
||||||
|
|
||||||
async def _discover_or_create_with_lock_async(self, thread_id: str, sandbox_id: str) -> str:
|
|
||||||
"""Async counterpart to ``_discover_or_create_with_lock``."""
|
|
||||||
paths = get_paths()
|
|
||||||
user_id = get_effective_user_id()
|
|
||||||
await asyncio.to_thread(paths.ensure_thread_dirs, thread_id, user_id=user_id)
|
|
||||||
lock_path = paths.thread_dir(thread_id, user_id=user_id) / f"{sandbox_id}.lock"
|
|
||||||
|
|
||||||
lock_file = await asyncio.to_thread(_open_lock_file, lock_path)
|
|
||||||
locked = False
|
|
||||||
try:
|
|
||||||
await asyncio.to_thread(_lock_file_exclusive, lock_file)
|
|
||||||
locked = True
|
|
||||||
# Re-check in-process caches under the file lock in case another
|
|
||||||
# thread in this process won the race while we were waiting.
|
|
||||||
cached_id = self._recheck_cached_sandbox(thread_id, sandbox_id)
|
|
||||||
if cached_id is not None:
|
|
||||||
return cached_id
|
|
||||||
|
|
||||||
# Backend discovery is sync because local discovery may inspect
|
|
||||||
# Docker and perform a health check; keep it off the event loop.
|
|
||||||
discovered = await asyncio.to_thread(self._backend.discover, sandbox_id)
|
|
||||||
if discovered is not None:
|
|
||||||
return self._register_discovered_sandbox(thread_id, discovered)
|
|
||||||
|
|
||||||
return await self._create_sandbox_async(thread_id, sandbox_id)
|
|
||||||
finally:
|
|
||||||
if locked:
|
|
||||||
await asyncio.to_thread(_unlock_file, lock_file)
|
|
||||||
await asyncio.to_thread(lock_file.close)
|
|
||||||
|
|
||||||
def _evict_oldest_warm(self) -> str | None:
|
def _evict_oldest_warm(self) -> str | None:
|
||||||
"""Destroy the oldest container in the warm pool to free capacity.
|
"""Destroy the oldest container in the warm pool to free capacity.
|
||||||
|
|
||||||
@@ -734,10 +574,18 @@ class AioSandboxProvider(SandboxProvider):
|
|||||||
|
|
||||||
# Enforce replicas: only warm-pool containers count toward eviction budget.
|
# Enforce replicas: only warm-pool containers count toward eviction budget.
|
||||||
# Active sandboxes are in use by live threads and must not be forcibly stopped.
|
# Active sandboxes are in use by live threads and must not be forcibly stopped.
|
||||||
replicas, total = self._replica_count()
|
replicas = self._config.get("replicas", DEFAULT_REPLICAS)
|
||||||
|
with self._lock:
|
||||||
|
total = len(self._sandboxes) + len(self._warm_pool)
|
||||||
if total >= replicas:
|
if total >= replicas:
|
||||||
evicted = self._evict_oldest_warm()
|
evicted = self._evict_oldest_warm()
|
||||||
self._log_replicas_soft_cap(replicas, sandbox_id, evicted)
|
if evicted:
|
||||||
|
logger.info(f"Evicted warm-pool sandbox {evicted} to stay within replicas={replicas}")
|
||||||
|
else:
|
||||||
|
# All slots are occupied by active sandboxes — proceed anyway and log.
|
||||||
|
# The replicas limit is a soft cap; we never forcibly stop a container
|
||||||
|
# that is actively serving a thread.
|
||||||
|
logger.warning(f"All {replicas} replica slots are in active use; creating sandbox {sandbox_id} beyond the soft limit")
|
||||||
|
|
||||||
info = self._backend.create(thread_id, sandbox_id, extra_mounts=extra_mounts or None)
|
info = self._backend.create(thread_id, sandbox_id, extra_mounts=extra_mounts or None)
|
||||||
|
|
||||||
@@ -746,27 +594,16 @@ class AioSandboxProvider(SandboxProvider):
|
|||||||
self._backend.destroy(info)
|
self._backend.destroy(info)
|
||||||
raise RuntimeError(f"Sandbox {sandbox_id} failed to become ready within timeout at {info.sandbox_url}")
|
raise RuntimeError(f"Sandbox {sandbox_id} failed to become ready within timeout at {info.sandbox_url}")
|
||||||
|
|
||||||
return self._register_created_sandbox(thread_id, sandbox_id, info)
|
sandbox = AioSandbox(id=sandbox_id, base_url=info.sandbox_url)
|
||||||
|
with self._lock:
|
||||||
|
self._sandboxes[sandbox_id] = sandbox
|
||||||
|
self._sandbox_infos[sandbox_id] = info
|
||||||
|
self._last_activity[sandbox_id] = time.time()
|
||||||
|
if thread_id:
|
||||||
|
self._thread_sandboxes[thread_id] = sandbox_id
|
||||||
|
|
||||||
async def _create_sandbox_async(self, thread_id: str | None, sandbox_id: str) -> str:
|
logger.info(f"Created sandbox {sandbox_id} for thread {thread_id} at {info.sandbox_url}")
|
||||||
"""Async counterpart to ``_create_sandbox``."""
|
return sandbox_id
|
||||||
extra_mounts = await asyncio.to_thread(self._get_extra_mounts, thread_id)
|
|
||||||
|
|
||||||
# Enforce replicas: only warm-pool containers count toward eviction budget.
|
|
||||||
# Active sandboxes are in use by live threads and must not be forcibly stopped.
|
|
||||||
replicas, total = self._replica_count()
|
|
||||||
if total >= replicas:
|
|
||||||
evicted = await asyncio.to_thread(self._evict_oldest_warm)
|
|
||||||
self._log_replicas_soft_cap(replicas, sandbox_id, evicted)
|
|
||||||
|
|
||||||
info = await asyncio.to_thread(self._backend.create, thread_id, sandbox_id, extra_mounts=extra_mounts or None)
|
|
||||||
|
|
||||||
# Wait for sandbox to be ready without blocking the event loop.
|
|
||||||
if not await wait_for_sandbox_ready_async(info.sandbox_url, timeout=60):
|
|
||||||
await asyncio.to_thread(self._backend.destroy, info)
|
|
||||||
raise RuntimeError(f"Sandbox {sandbox_id} failed to become ready within timeout at {info.sandbox_url}")
|
|
||||||
|
|
||||||
return self._register_created_sandbox(thread_id, sandbox_id, info)
|
|
||||||
|
|
||||||
def get(self, sandbox_id: str) -> Sandbox | None:
|
def get(self, sandbox_id: str) -> Sandbox | None:
|
||||||
"""Get a sandbox by ID. Updates last activity timestamp.
|
"""Get a sandbox by ID. Updates last activity timestamp.
|
||||||
|
|||||||
@@ -2,12 +2,10 @@
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import asyncio
|
|
||||||
import logging
|
import logging
|
||||||
import time
|
import time
|
||||||
from abc import ABC, abstractmethod
|
from abc import ABC, abstractmethod
|
||||||
|
|
||||||
import httpx
|
|
||||||
import requests
|
import requests
|
||||||
|
|
||||||
from .sandbox_info import SandboxInfo
|
from .sandbox_info import SandboxInfo
|
||||||
@@ -37,34 +35,6 @@ def wait_for_sandbox_ready(sandbox_url: str, timeout: int = 30) -> bool:
|
|||||||
return False
|
return False
|
||||||
|
|
||||||
|
|
||||||
async def wait_for_sandbox_ready_async(sandbox_url: str, timeout: int = 30, poll_interval: float = 1.0) -> bool:
|
|
||||||
"""Async variant of sandbox readiness polling.
|
|
||||||
|
|
||||||
Use this from async runtime paths so sandbox startup waits do not block the
|
|
||||||
event loop. The synchronous ``wait_for_sandbox_ready`` function remains for
|
|
||||||
existing synchronous backend/provider call sites.
|
|
||||||
"""
|
|
||||||
loop = asyncio.get_running_loop()
|
|
||||||
deadline = loop.time() + timeout
|
|
||||||
|
|
||||||
async with httpx.AsyncClient(timeout=5) as client:
|
|
||||||
while True:
|
|
||||||
remaining = deadline - loop.time()
|
|
||||||
if remaining <= 0:
|
|
||||||
break
|
|
||||||
try:
|
|
||||||
response = await client.get(f"{sandbox_url}/v1/sandbox", timeout=min(5.0, remaining))
|
|
||||||
if response.status_code == 200:
|
|
||||||
return True
|
|
||||||
except httpx.RequestError:
|
|
||||||
pass
|
|
||||||
remaining = deadline - loop.time()
|
|
||||||
if remaining <= 0:
|
|
||||||
break
|
|
||||||
await asyncio.sleep(min(poll_interval, remaining))
|
|
||||||
return False
|
|
||||||
|
|
||||||
|
|
||||||
class SandboxBackend(ABC):
|
class SandboxBackend(ABC):
|
||||||
"""Abstract base for sandbox provisioning backends.
|
"""Abstract base for sandbox provisioning backends.
|
||||||
|
|
||||||
@@ -74,7 +44,7 @@ class SandboxBackend(ABC):
|
|||||||
"""
|
"""
|
||||||
|
|
||||||
@abstractmethod
|
@abstractmethod
|
||||||
def create(self, thread_id: str | None, sandbox_id: str, extra_mounts: list[tuple[str, str, bool]] | None = None) -> SandboxInfo:
|
def create(self, thread_id: str, sandbox_id: str, extra_mounts: list[tuple[str, str, bool]] | None = None) -> SandboxInfo:
|
||||||
"""Create/provision a new sandbox.
|
"""Create/provision a new sandbox.
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
|
|||||||
@@ -241,7 +241,7 @@ class LocalContainerBackend(SandboxBackend):
|
|||||||
|
|
||||||
# ── SandboxBackend interface ──────────────────────────────────────────
|
# ── SandboxBackend interface ──────────────────────────────────────────
|
||||||
|
|
||||||
def create(self, thread_id: str | None, sandbox_id: str, extra_mounts: list[tuple[str, str, bool]] | None = None) -> SandboxInfo:
|
def create(self, thread_id: str, sandbox_id: str, extra_mounts: list[tuple[str, str, bool]] | None = None) -> SandboxInfo:
|
||||||
"""Start a new container and return its connection info.
|
"""Start a new container and return its connection info.
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
|
|||||||
@@ -21,8 +21,6 @@ import logging
|
|||||||
|
|
||||||
import requests
|
import requests
|
||||||
|
|
||||||
from deerflow.runtime.user_context import get_effective_user_id
|
|
||||||
|
|
||||||
from .backend import SandboxBackend
|
from .backend import SandboxBackend
|
||||||
from .sandbox_info import SandboxInfo
|
from .sandbox_info import SandboxInfo
|
||||||
|
|
||||||
@@ -59,7 +57,7 @@ class RemoteSandboxBackend(SandboxBackend):
|
|||||||
|
|
||||||
def create(
|
def create(
|
||||||
self,
|
self,
|
||||||
thread_id: str | None,
|
thread_id: str,
|
||||||
sandbox_id: str,
|
sandbox_id: str,
|
||||||
extra_mounts: list[tuple[str, str, bool]] | None = None,
|
extra_mounts: list[tuple[str, str, bool]] | None = None,
|
||||||
) -> SandboxInfo:
|
) -> SandboxInfo:
|
||||||
@@ -132,7 +130,7 @@ class RemoteSandboxBackend(SandboxBackend):
|
|||||||
logger.warning("Provisioner list_running failed: %s", exc)
|
logger.warning("Provisioner list_running failed: %s", exc)
|
||||||
return []
|
return []
|
||||||
|
|
||||||
def _provisioner_create(self, thread_id: str | None, sandbox_id: str, extra_mounts: list[tuple[str, str, bool]] | None = None) -> SandboxInfo:
|
def _provisioner_create(self, thread_id: str, sandbox_id: str, extra_mounts: list[tuple[str, str, bool]] | None = None) -> SandboxInfo:
|
||||||
"""POST /api/sandboxes → create Pod + Service."""
|
"""POST /api/sandboxes → create Pod + Service."""
|
||||||
try:
|
try:
|
||||||
resp = requests.post(
|
resp = requests.post(
|
||||||
@@ -140,7 +138,6 @@ class RemoteSandboxBackend(SandboxBackend):
|
|||||||
json={
|
json={
|
||||||
"sandbox_id": sandbox_id,
|
"sandbox_id": sandbox_id,
|
||||||
"thread_id": thread_id,
|
"thread_id": thread_id,
|
||||||
"user_id": get_effective_user_id(),
|
|
||||||
},
|
},
|
||||||
timeout=30,
|
timeout=30,
|
||||||
)
|
)
|
||||||
|
|||||||
@@ -20,7 +20,6 @@ from deerflow.config.memory_config import MemoryConfig, load_memory_config_from_
|
|||||||
from deerflow.config.model_config import ModelConfig
|
from deerflow.config.model_config import ModelConfig
|
||||||
from deerflow.config.run_events_config import RunEventsConfig
|
from deerflow.config.run_events_config import RunEventsConfig
|
||||||
from deerflow.config.runtime_paths import existing_project_file
|
from deerflow.config.runtime_paths import existing_project_file
|
||||||
from deerflow.config.safety_finish_reason_config import SafetyFinishReasonConfig
|
|
||||||
from deerflow.config.sandbox_config import SandboxConfig
|
from deerflow.config.sandbox_config import SandboxConfig
|
||||||
from deerflow.config.skill_evolution_config import SkillEvolutionConfig
|
from deerflow.config.skill_evolution_config import SkillEvolutionConfig
|
||||||
from deerflow.config.skills_config import SkillsConfig
|
from deerflow.config.skills_config import SkillsConfig
|
||||||
@@ -30,7 +29,6 @@ from deerflow.config.summarization_config import SummarizationConfig, load_summa
|
|||||||
from deerflow.config.title_config import TitleConfig, load_title_config_from_dict
|
from deerflow.config.title_config import TitleConfig, load_title_config_from_dict
|
||||||
from deerflow.config.token_usage_config import TokenUsageConfig
|
from deerflow.config.token_usage_config import TokenUsageConfig
|
||||||
from deerflow.config.tool_config import ToolConfig, ToolGroupConfig
|
from deerflow.config.tool_config import ToolConfig, ToolGroupConfig
|
||||||
from deerflow.config.tool_output_config import ToolOutputConfig
|
|
||||||
from deerflow.config.tool_search_config import ToolSearchConfig, load_tool_search_config_from_dict
|
from deerflow.config.tool_search_config import ToolSearchConfig, load_tool_search_config_from_dict
|
||||||
|
|
||||||
load_dotenv()
|
load_dotenv()
|
||||||
@@ -94,7 +92,6 @@ class AppConfig(BaseModel):
|
|||||||
skills: SkillsConfig = Field(default_factory=SkillsConfig, description="Skills configuration")
|
skills: SkillsConfig = Field(default_factory=SkillsConfig, description="Skills configuration")
|
||||||
skill_evolution: SkillEvolutionConfig = Field(default_factory=SkillEvolutionConfig, description="Agent-managed skill evolution configuration")
|
skill_evolution: SkillEvolutionConfig = Field(default_factory=SkillEvolutionConfig, description="Agent-managed skill evolution configuration")
|
||||||
extensions: ExtensionsConfig = Field(default_factory=ExtensionsConfig, description="Extensions configuration (MCP servers and skills state)")
|
extensions: ExtensionsConfig = Field(default_factory=ExtensionsConfig, description="Extensions configuration (MCP servers and skills state)")
|
||||||
tool_output: ToolOutputConfig = Field(default_factory=ToolOutputConfig, description="Tool output budget protection configuration")
|
|
||||||
tool_search: ToolSearchConfig = Field(default_factory=ToolSearchConfig, description="Tool search / deferred loading configuration")
|
tool_search: ToolSearchConfig = Field(default_factory=ToolSearchConfig, description="Tool search / deferred loading configuration")
|
||||||
title: TitleConfig = Field(default_factory=TitleConfig, description="Automatic title generation configuration")
|
title: TitleConfig = Field(default_factory=TitleConfig, description="Automatic title generation configuration")
|
||||||
summarization: SummarizationConfig = Field(default_factory=SummarizationConfig, description="Conversation summarization configuration")
|
summarization: SummarizationConfig = Field(default_factory=SummarizationConfig, description="Conversation summarization configuration")
|
||||||
@@ -105,7 +102,6 @@ class AppConfig(BaseModel):
|
|||||||
guardrails: GuardrailsConfig = Field(default_factory=GuardrailsConfig, description="Guardrail middleware configuration")
|
guardrails: GuardrailsConfig = Field(default_factory=GuardrailsConfig, description="Guardrail middleware configuration")
|
||||||
circuit_breaker: CircuitBreakerConfig = Field(default_factory=CircuitBreakerConfig, description="LLM circuit breaker configuration")
|
circuit_breaker: CircuitBreakerConfig = Field(default_factory=CircuitBreakerConfig, description="LLM circuit breaker configuration")
|
||||||
loop_detection: LoopDetectionConfig = Field(default_factory=LoopDetectionConfig, description="Loop detection middleware configuration")
|
loop_detection: LoopDetectionConfig = Field(default_factory=LoopDetectionConfig, description="Loop detection middleware configuration")
|
||||||
safety_finish_reason: SafetyFinishReasonConfig = Field(default_factory=SafetyFinishReasonConfig, description="Provider safety-filter finish_reason interception middleware configuration")
|
|
||||||
model_config = ConfigDict(extra="allow")
|
model_config = ConfigDict(extra="allow")
|
||||||
database: DatabaseConfig = Field(default_factory=DatabaseConfig, description="Unified database backend configuration")
|
database: DatabaseConfig = Field(default_factory=DatabaseConfig, description="Unified database backend configuration")
|
||||||
run_events: RunEventsConfig = Field(default_factory=RunEventsConfig, description="Run event storage configuration")
|
run_events: RunEventsConfig = Field(default_factory=RunEventsConfig, description="Run event storage configuration")
|
||||||
|
|||||||
@@ -141,7 +141,7 @@ class ExtensionsConfig(BaseModel):
|
|||||||
try:
|
try:
|
||||||
with open(resolved_path, encoding="utf-8") as f:
|
with open(resolved_path, encoding="utf-8") as f:
|
||||||
config_data = json.load(f)
|
config_data = json.load(f)
|
||||||
config_data = cls.resolve_env_variables(config_data)
|
cls.resolve_env_variables(config_data)
|
||||||
return cls.model_validate(config_data)
|
return cls.model_validate(config_data)
|
||||||
except json.JSONDecodeError as e:
|
except json.JSONDecodeError as e:
|
||||||
raise ValueError(f"Extensions config file at {resolved_path} is not valid JSON: {e}") from e
|
raise ValueError(f"Extensions config file at {resolved_path} is not valid JSON: {e}") from e
|
||||||
@@ -149,7 +149,7 @@ class ExtensionsConfig(BaseModel):
|
|||||||
raise RuntimeError(f"Failed to load extensions config from {resolved_path}: {e}") from e
|
raise RuntimeError(f"Failed to load extensions config from {resolved_path}: {e}") from e
|
||||||
|
|
||||||
@classmethod
|
@classmethod
|
||||||
def resolve_env_variables(cls, config: Any) -> Any:
|
def resolve_env_variables(cls, config: dict[str, Any]) -> dict[str, Any]:
|
||||||
"""Recursively resolve environment variables in the config.
|
"""Recursively resolve environment variables in the config.
|
||||||
|
|
||||||
Environment variables are resolved using the `os.getenv` function. Example: $OPENAI_API_KEY
|
Environment variables are resolved using the `os.getenv` function. Example: $OPENAI_API_KEY
|
||||||
@@ -160,26 +160,23 @@ class ExtensionsConfig(BaseModel):
|
|||||||
Returns:
|
Returns:
|
||||||
The config with environment variables resolved.
|
The config with environment variables resolved.
|
||||||
"""
|
"""
|
||||||
if isinstance(config, str):
|
for key, value in config.items():
|
||||||
if not config.startswith("$"):
|
if isinstance(value, str):
|
||||||
return config
|
if value.startswith("$"):
|
||||||
env_value = os.getenv(config[1:])
|
env_value = os.getenv(value[1:])
|
||||||
if env_value is None:
|
if env_value is None:
|
||||||
# Unresolved placeholder — store empty string so downstream
|
# Unresolved placeholder — store empty string so downstream
|
||||||
# consumers (e.g. MCP servers) don't receive the literal "$VAR"
|
# consumers (e.g. MCP servers) don't receive the literal "$VAR"
|
||||||
# token as an actual environment value.
|
# token as an actual environment value.
|
||||||
return ""
|
config[key] = ""
|
||||||
return env_value
|
else:
|
||||||
|
config[key] = env_value
|
||||||
if isinstance(config, dict):
|
else:
|
||||||
return {key: cls.resolve_env_variables(value) for key, value in config.items()}
|
config[key] = value
|
||||||
|
elif isinstance(value, dict):
|
||||||
if isinstance(config, list):
|
config[key] = cls.resolve_env_variables(value)
|
||||||
return [cls.resolve_env_variables(item) for item in config]
|
elif isinstance(value, list):
|
||||||
|
config[key] = [cls.resolve_env_variables(item) if isinstance(item, dict) else item for item in value]
|
||||||
if isinstance(config, tuple):
|
|
||||||
return tuple(cls.resolve_env_variables(item) for item in config)
|
|
||||||
|
|
||||||
return config
|
return config
|
||||||
|
|
||||||
def get_enabled_mcp_servers(self) -> dict[str, McpServerConfig]:
|
def get_enabled_mcp_servers(self) -> dict[str, McpServerConfig]:
|
||||||
|
|||||||
@@ -1,47 +0,0 @@
|
|||||||
"""Configuration for SafetyFinishReasonMiddleware.
|
|
||||||
|
|
||||||
Mirrors the shape of GuardrailsConfig: detectors are loaded by class path
|
|
||||||
through ``deerflow.reflection.resolve_variable`` (same loader the
|
|
||||||
``guardrails.provider`` config uses) so users can drop in custom provider
|
|
||||||
detectors without modifying core code.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from pydantic import BaseModel, Field
|
|
||||||
|
|
||||||
|
|
||||||
class SafetyDetectorConfig(BaseModel):
|
|
||||||
"""One detector entry under ``safety_finish_reason.detectors``."""
|
|
||||||
|
|
||||||
use: str = Field(
|
|
||||||
description=("Class path of a SafetyTerminationDetector implementation (e.g. 'deerflow.agents.middlewares.safety_termination_detectors:OpenAICompatibleContentFilterDetector')."),
|
|
||||||
)
|
|
||||||
config: dict = Field(
|
|
||||||
default_factory=dict,
|
|
||||||
description="Constructor kwargs passed to the detector class.",
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
class SafetyFinishReasonConfig(BaseModel):
|
|
||||||
"""Configuration for the SafetyFinishReasonMiddleware.
|
|
||||||
|
|
||||||
The middleware intercepts AIMessages where the provider signaled a
|
|
||||||
safety-related termination (e.g. OpenAI ``finish_reason='content_filter'``)
|
|
||||||
while still returning tool calls, and suppresses those tool calls so the
|
|
||||||
half-truncated arguments never execute.
|
|
||||||
"""
|
|
||||||
|
|
||||||
enabled: bool = Field(
|
|
||||||
default=True,
|
|
||||||
description="Master switch for the SafetyFinishReasonMiddleware.",
|
|
||||||
)
|
|
||||||
detectors: list[SafetyDetectorConfig] | None = Field(
|
|
||||||
default=None,
|
|
||||||
description=(
|
|
||||||
"Custom detector list. Leave unset (None) to use the built-in "
|
|
||||||
"set covering OpenAI-compatible content_filter, Anthropic "
|
|
||||||
"refusal, and Gemini SAFETY/BLOCKLIST/PROHIBITED_CONTENT/SPII/"
|
|
||||||
"RECITATION. Provide a non-null list to fully override."
|
|
||||||
),
|
|
||||||
)
|
|
||||||
@@ -51,16 +51,3 @@ def load_title_config_from_dict(config_dict: dict) -> None:
|
|||||||
"""Load title configuration from a dictionary."""
|
"""Load title configuration from a dictionary."""
|
||||||
global _title_config
|
global _title_config
|
||||||
_title_config = TitleConfig(**config_dict)
|
_title_config = TitleConfig(**config_dict)
|
||||||
|
|
||||||
|
|
||||||
def reset_title_config() -> None:
|
|
||||||
"""Restore the title configuration to its pristine ``TitleConfig()`` default.
|
|
||||||
|
|
||||||
Public API so that tests do not have to reach into the private
|
|
||||||
``_title_config`` module attribute. ``AppConfig.from_file()`` calls
|
|
||||||
:func:`load_title_config_from_dict`, which permanently mutates the
|
|
||||||
singleton; tests that need a clean slate between cases should call
|
|
||||||
this between tests.
|
|
||||||
"""
|
|
||||||
global _title_config
|
|
||||||
_title_config = TitleConfig()
|
|
||||||
|
|||||||
@@ -1,62 +0,0 @@
|
|||||||
"""Configuration for tool output budget protection."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from pydantic import BaseModel, Field
|
|
||||||
|
|
||||||
|
|
||||||
class ToolOutputConfig(BaseModel):
|
|
||||||
"""Config section for tool-result output budget enforcement.
|
|
||||||
|
|
||||||
When a tool returns more than ``externalize_min_chars`` characters,
|
|
||||||
the full output is persisted to disk and replaced with a compact
|
|
||||||
preview + file reference. If disk persistence is unavailable the
|
|
||||||
output falls back to head+tail truncation.
|
|
||||||
"""
|
|
||||||
|
|
||||||
enabled: bool = Field(
|
|
||||||
default=True,
|
|
||||||
description="Enable the tool output budget middleware.",
|
|
||||||
)
|
|
||||||
externalize_min_chars: int = Field(
|
|
||||||
default=12_000,
|
|
||||||
ge=0,
|
|
||||||
description="Character threshold to trigger disk externalization. Outputs below this pass through unchanged. Set to 0 to disable externalization (fallback truncation still applies when output exceeds fallback_max_chars).",
|
|
||||||
)
|
|
||||||
preview_head_chars: int = Field(
|
|
||||||
default=2_000,
|
|
||||||
ge=0,
|
|
||||||
description="Characters to keep from the head of the output in the preview.",
|
|
||||||
)
|
|
||||||
preview_tail_chars: int = Field(
|
|
||||||
default=1_000,
|
|
||||||
ge=0,
|
|
||||||
description="Characters to keep from the tail of the output in the preview.",
|
|
||||||
)
|
|
||||||
fallback_max_chars: int = Field(
|
|
||||||
default=30_000,
|
|
||||||
ge=0,
|
|
||||||
description="Maximum characters when disk persistence is unavailable. 0 disables fallback truncation.",
|
|
||||||
)
|
|
||||||
fallback_head_chars: int = Field(
|
|
||||||
default=8_000,
|
|
||||||
ge=0,
|
|
||||||
description="Head characters for fallback truncation.",
|
|
||||||
)
|
|
||||||
fallback_tail_chars: int = Field(
|
|
||||||
default=3_000,
|
|
||||||
ge=0,
|
|
||||||
description="Tail characters for fallback truncation.",
|
|
||||||
)
|
|
||||||
storage_subdir: str = Field(
|
|
||||||
default=".tool-results",
|
|
||||||
description="Subdirectory under the thread outputs path for persisted tool results.",
|
|
||||||
)
|
|
||||||
exempt_tools: list[str] = Field(
|
|
||||||
default_factory=lambda: ["read_file", "read_file_tool"],
|
|
||||||
description="Tool names exempt from budget enforcement (prevents persist→read→persist loops).",
|
|
||||||
)
|
|
||||||
tool_overrides: dict[str, int] = Field(
|
|
||||||
default_factory=dict,
|
|
||||||
description="Per-tool externalize_min_chars overrides. Keys are tool names, values are char thresholds. Use 0 to disable externalization for a specific tool.",
|
|
||||||
)
|
|
||||||
@@ -147,15 +147,3 @@ def validate_enabled_tracing_providers() -> None:
|
|||||||
def is_tracing_enabled() -> bool:
|
def is_tracing_enabled() -> bool:
|
||||||
"""Check if any tracing provider is enabled and fully configured."""
|
"""Check if any tracing provider is enabled and fully configured."""
|
||||||
return get_tracing_config().is_configured
|
return get_tracing_config().is_configured
|
||||||
|
|
||||||
|
|
||||||
def reset_tracing_config() -> None:
|
|
||||||
"""Discard the cached :class:`TracingConfig` so the next call rebuilds it.
|
|
||||||
|
|
||||||
Public API so that tests do not have to reach into the private
|
|
||||||
``_tracing_config`` module attribute. A future internal rename would
|
|
||||||
silently break callers that mutate the attribute directly.
|
|
||||||
"""
|
|
||||||
global _tracing_config
|
|
||||||
with _config_lock:
|
|
||||||
_tracing_config = None
|
|
||||||
|
|||||||
@@ -87,7 +87,8 @@ def get_cached_mcp_tools() -> list[BaseTool]:
|
|||||||
|
|
||||||
Also checks if the config file has been modified since last initialization,
|
Also checks if the config file has been modified since last initialization,
|
||||||
and re-initializes if needed. This ensures that changes made through the
|
and re-initializes if needed. This ensures that changes made through the
|
||||||
Gateway API are reflected in the Gateway-embedded LangGraph runtime.
|
Gateway API (which runs in a separate process) are reflected in the
|
||||||
|
LangGraph Server.
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
List of cached MCP tools.
|
List of cached MCP tools.
|
||||||
@@ -133,25 +134,9 @@ def reset_mcp_tools_cache() -> None:
|
|||||||
"""Reset the MCP tools cache.
|
"""Reset the MCP tools cache.
|
||||||
|
|
||||||
This is useful for testing or when you want to reload MCP tools.
|
This is useful for testing or when you want to reload MCP tools.
|
||||||
Also closes all persistent MCP sessions so they are recreated on
|
|
||||||
the next tool load.
|
|
||||||
"""
|
"""
|
||||||
global _mcp_tools_cache, _cache_initialized, _config_mtime
|
global _mcp_tools_cache, _cache_initialized, _config_mtime
|
||||||
_mcp_tools_cache = None
|
_mcp_tools_cache = None
|
||||||
_cache_initialized = False
|
_cache_initialized = False
|
||||||
_config_mtime = None
|
_config_mtime = None
|
||||||
|
|
||||||
# Close persistent sessions – they will be recreated by the next
|
|
||||||
# get_mcp_tools() call with the (possibly updated) connection config.
|
|
||||||
try:
|
|
||||||
from deerflow.mcp.session_pool import get_session_pool
|
|
||||||
|
|
||||||
pool = get_session_pool()
|
|
||||||
pool.close_all_sync()
|
|
||||||
except Exception:
|
|
||||||
logger.debug("Could not close MCP session pool on cache reset", exc_info=True)
|
|
||||||
|
|
||||||
from deerflow.mcp.session_pool import reset_session_pool
|
|
||||||
|
|
||||||
reset_session_pool()
|
|
||||||
logger.info("MCP tools cache reset")
|
logger.info("MCP tools cache reset")
|
||||||
|
|||||||
@@ -1,198 +0,0 @@
|
|||||||
"""Persistent MCP session pool for stateful tool calls.
|
|
||||||
|
|
||||||
When MCP tools are loaded via langchain-mcp-adapters with ``session=None``,
|
|
||||||
each tool call creates a new MCP session. For stateful servers like Playwright,
|
|
||||||
this means browser state (opened pages, filled forms) is lost between calls.
|
|
||||||
|
|
||||||
This module provides a session pool that maintains persistent MCP sessions,
|
|
||||||
scoped by ``(server_name, scope_key)`` — typically scope_key is the thread_id —
|
|
||||||
so that consecutive tool calls share the same session and server-side state.
|
|
||||||
Sessions are evicted in LRU order when the pool reaches capacity.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import asyncio
|
|
||||||
import logging
|
|
||||||
import threading
|
|
||||||
from collections import OrderedDict
|
|
||||||
from typing import Any
|
|
||||||
|
|
||||||
from mcp import ClientSession
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
|
||||||
|
|
||||||
|
|
||||||
class MCPSessionPool:
|
|
||||||
"""Manages persistent MCP sessions scoped by ``(server_name, scope_key)``."""
|
|
||||||
|
|
||||||
MAX_SESSIONS = 256
|
|
||||||
SESSION_CLOSE_TIMEOUT = 5.0 # seconds to wait when closing a session via run_coroutine_threadsafe
|
|
||||||
|
|
||||||
def __init__(self) -> None:
|
|
||||||
self._entries: OrderedDict[
|
|
||||||
tuple[str, str],
|
|
||||||
tuple[ClientSession, asyncio.AbstractEventLoop],
|
|
||||||
] = OrderedDict()
|
|
||||||
self._context_managers: dict[tuple[str, str], Any] = {}
|
|
||||||
# threading.Lock is not bound to any event loop, so it is safe to
|
|
||||||
# acquire from both async paths and sync/worker-thread paths.
|
|
||||||
self._lock = threading.Lock()
|
|
||||||
|
|
||||||
async def get_session(
|
|
||||||
self,
|
|
||||||
server_name: str,
|
|
||||||
scope_key: str,
|
|
||||||
connection: dict[str, Any],
|
|
||||||
) -> ClientSession:
|
|
||||||
"""Get or create a persistent MCP session.
|
|
||||||
|
|
||||||
If an existing session was created in a different event loop (e.g.
|
|
||||||
the sync-wrapper path), it is closed and replaced with a fresh one
|
|
||||||
in the current loop.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
server_name: MCP server name.
|
|
||||||
scope_key: Isolation key (typically thread_id).
|
|
||||||
connection: Connection configuration for ``create_session``.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
An initialized ``ClientSession``.
|
|
||||||
"""
|
|
||||||
key = (server_name, scope_key)
|
|
||||||
current_loop = asyncio.get_running_loop()
|
|
||||||
|
|
||||||
# Phase 1: inspect/mutate the registry under the thread lock (no awaits).
|
|
||||||
cms_to_close: list[tuple[tuple[str, str], Any]] = []
|
|
||||||
with self._lock:
|
|
||||||
if key in self._entries:
|
|
||||||
session, loop = self._entries[key]
|
|
||||||
if loop is current_loop:
|
|
||||||
self._entries.move_to_end(key)
|
|
||||||
return session
|
|
||||||
# Session belongs to a different event loop – evict it.
|
|
||||||
cm = self._context_managers.pop(key, None)
|
|
||||||
self._entries.pop(key)
|
|
||||||
if cm is not None:
|
|
||||||
cms_to_close.append((key, cm))
|
|
||||||
|
|
||||||
# Evict LRU entries when at capacity.
|
|
||||||
while len(self._entries) >= self.MAX_SESSIONS:
|
|
||||||
oldest_key = next(iter(self._entries))
|
|
||||||
cm = self._context_managers.pop(oldest_key, None)
|
|
||||||
self._entries.pop(oldest_key)
|
|
||||||
if cm is not None:
|
|
||||||
cms_to_close.append((oldest_key, cm))
|
|
||||||
|
|
||||||
# Phase 2: async cleanup outside the lock so we never await while holding it.
|
|
||||||
for close_key, cm in cms_to_close:
|
|
||||||
try:
|
|
||||||
await cm.__aexit__(None, None, None)
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Error closing MCP session %s", close_key, exc_info=True)
|
|
||||||
|
|
||||||
from langchain_mcp_adapters.sessions import create_session
|
|
||||||
|
|
||||||
cm = create_session(connection)
|
|
||||||
session = await cm.__aenter__()
|
|
||||||
await session.initialize()
|
|
||||||
|
|
||||||
# Phase 3: register the new session under the lock.
|
|
||||||
with self._lock:
|
|
||||||
self._entries[key] = (session, current_loop)
|
|
||||||
self._context_managers[key] = cm
|
|
||||||
logger.info("Created persistent MCP session for %s/%s", server_name, scope_key)
|
|
||||||
return session
|
|
||||||
|
|
||||||
# ------------------------------------------------------------------
|
|
||||||
# Cleanup helpers
|
|
||||||
# ------------------------------------------------------------------
|
|
||||||
|
|
||||||
async def _close_cm(self, key: tuple[str, str], cm: Any) -> None:
|
|
||||||
"""Close a single context manager (must be called WITHOUT the lock)."""
|
|
||||||
try:
|
|
||||||
await cm.__aexit__(None, None, None)
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Error closing MCP session %s", key, exc_info=True)
|
|
||||||
|
|
||||||
async def close_scope(self, scope_key: str) -> None:
|
|
||||||
"""Close all sessions for a given scope (e.g. thread_id)."""
|
|
||||||
with self._lock:
|
|
||||||
keys = [k for k in self._entries if k[1] == scope_key]
|
|
||||||
cms = [(k, self._context_managers.pop(k, None)) for k in keys]
|
|
||||||
for k in keys:
|
|
||||||
self._entries.pop(k, None)
|
|
||||||
for key, cm in cms:
|
|
||||||
if cm is not None:
|
|
||||||
await self._close_cm(key, cm)
|
|
||||||
|
|
||||||
async def close_server(self, server_name: str) -> None:
|
|
||||||
"""Close all sessions for a given server."""
|
|
||||||
with self._lock:
|
|
||||||
keys = [k for k in self._entries if k[0] == server_name]
|
|
||||||
cms = [(k, self._context_managers.pop(k, None)) for k in keys]
|
|
||||||
for k in keys:
|
|
||||||
self._entries.pop(k, None)
|
|
||||||
for key, cm in cms:
|
|
||||||
if cm is not None:
|
|
||||||
await self._close_cm(key, cm)
|
|
||||||
|
|
||||||
async def close_all(self) -> None:
|
|
||||||
"""Close every managed session."""
|
|
||||||
with self._lock:
|
|
||||||
cms = list(self._context_managers.items())
|
|
||||||
self._context_managers.clear()
|
|
||||||
self._entries.clear()
|
|
||||||
for key, cm in cms:
|
|
||||||
await self._close_cm(key, cm)
|
|
||||||
|
|
||||||
def close_all_sync(self) -> None:
|
|
||||||
"""Close all sessions using their owning event loops (synchronous).
|
|
||||||
|
|
||||||
Each session is closed on the loop it was created in, avoiding
|
|
||||||
cross-loop resource leaks. Safe to call from any thread without an
|
|
||||||
active event loop.
|
|
||||||
"""
|
|
||||||
with self._lock:
|
|
||||||
entries = list(self._entries.items())
|
|
||||||
cms = dict(self._context_managers)
|
|
||||||
self._entries.clear()
|
|
||||||
self._context_managers.clear()
|
|
||||||
|
|
||||||
for key, (_, loop) in entries:
|
|
||||||
cm = cms.get(key)
|
|
||||||
if cm is None or loop.is_closed():
|
|
||||||
continue
|
|
||||||
try:
|
|
||||||
if loop.is_running():
|
|
||||||
# Schedule on the owning loop from this (different) thread.
|
|
||||||
future = asyncio.run_coroutine_threadsafe(cm.__aexit__(None, None, None), loop)
|
|
||||||
future.result(timeout=self.SESSION_CLOSE_TIMEOUT)
|
|
||||||
else:
|
|
||||||
loop.run_until_complete(cm.__aexit__(None, None, None))
|
|
||||||
except Exception:
|
|
||||||
logger.debug("Error closing MCP session %s during sync close", key, exc_info=True)
|
|
||||||
|
|
||||||
|
|
||||||
# ------------------------------------------------------------------
|
|
||||||
# Module-level singleton
|
|
||||||
# ------------------------------------------------------------------
|
|
||||||
|
|
||||||
_pool: MCPSessionPool | None = None
|
|
||||||
_pool_lock = threading.Lock()
|
|
||||||
|
|
||||||
|
|
||||||
def get_session_pool() -> MCPSessionPool:
|
|
||||||
"""Return the global session-pool singleton."""
|
|
||||||
global _pool
|
|
||||||
if _pool is None:
|
|
||||||
with _pool_lock:
|
|
||||||
if _pool is None:
|
|
||||||
_pool = MCPSessionPool()
|
|
||||||
return _pool
|
|
||||||
|
|
||||||
|
|
||||||
def reset_session_pool() -> None:
|
|
||||||
"""Reset the singleton (for tests)."""
|
|
||||||
global _pool
|
|
||||||
_pool = None
|
|
||||||
@@ -1,183 +1,21 @@
|
|||||||
"""Load MCP tools using langchain-mcp-adapters with stdio session pooling."""
|
"""Load MCP tools using langchain-mcp-adapters."""
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import logging
|
import logging
|
||||||
from typing import Any
|
|
||||||
|
|
||||||
from langchain_core.tools import BaseTool, StructuredTool
|
from langchain_core.tools import BaseTool
|
||||||
from langgraph.config import get_config
|
|
||||||
|
|
||||||
from deerflow.config.extensions_config import ExtensionsConfig
|
from deerflow.config.extensions_config import ExtensionsConfig
|
||||||
from deerflow.mcp.client import build_servers_config
|
from deerflow.mcp.client import build_servers_config
|
||||||
from deerflow.mcp.oauth import build_oauth_tool_interceptor, get_initial_oauth_headers
|
from deerflow.mcp.oauth import build_oauth_tool_interceptor, get_initial_oauth_headers
|
||||||
from deerflow.mcp.session_pool import get_session_pool
|
|
||||||
from deerflow.reflection import resolve_variable
|
from deerflow.reflection import resolve_variable
|
||||||
from deerflow.tools.sync import make_sync_tool_wrapper
|
from deerflow.tools.sync import make_sync_tool_wrapper
|
||||||
from deerflow.tools.types import Runtime
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
def _extract_thread_id(runtime: Runtime | None) -> str:
|
|
||||||
"""Extract thread_id from the injected tool runtime or LangGraph config."""
|
|
||||||
if runtime is not None:
|
|
||||||
tid = runtime.context.get("thread_id") if runtime.context else None
|
|
||||||
if tid is not None:
|
|
||||||
return str(tid)
|
|
||||||
config = runtime.config or {}
|
|
||||||
tid = config.get("configurable", {}).get("thread_id")
|
|
||||||
if tid is not None:
|
|
||||||
return str(tid)
|
|
||||||
|
|
||||||
try:
|
|
||||||
tid = get_config().get("configurable", {}).get("thread_id")
|
|
||||||
return str(tid) if tid is not None else "default"
|
|
||||||
except RuntimeError:
|
|
||||||
return "default"
|
|
||||||
|
|
||||||
|
|
||||||
def _convert_call_tool_result(call_tool_result: Any) -> Any:
|
|
||||||
"""Convert an MCP CallToolResult to the LangChain ``content_and_artifact`` format.
|
|
||||||
|
|
||||||
Implements the same conversion logic as the adapter without relying on
|
|
||||||
the private ``langchain_mcp_adapters.tools._convert_call_tool_result`` symbol.
|
|
||||||
"""
|
|
||||||
from langchain_core.messages import ToolMessage
|
|
||||||
from langchain_core.messages.content import create_file_block, create_image_block, create_text_block
|
|
||||||
from langchain_core.tools import ToolException
|
|
||||||
from mcp.types import EmbeddedResource, ImageContent, ResourceLink, TextContent, TextResourceContents
|
|
||||||
|
|
||||||
# Pass ToolMessage through directly (interceptor short-circuit).
|
|
||||||
if isinstance(call_tool_result, ToolMessage):
|
|
||||||
return call_tool_result, None
|
|
||||||
|
|
||||||
# Pass LangGraph Command through directly when langgraph is installed.
|
|
||||||
try:
|
|
||||||
from langgraph.types import Command
|
|
||||||
|
|
||||||
if isinstance(call_tool_result, Command):
|
|
||||||
return call_tool_result, None
|
|
||||||
except ImportError:
|
|
||||||
# langgraph is optional; if unavailable, continue with standard MCP content conversion.
|
|
||||||
pass
|
|
||||||
|
|
||||||
# Convert MCP content blocks to LangChain content blocks.
|
|
||||||
lc_content = []
|
|
||||||
for item in call_tool_result.content:
|
|
||||||
if isinstance(item, TextContent):
|
|
||||||
lc_content.append(create_text_block(text=item.text))
|
|
||||||
elif isinstance(item, ImageContent):
|
|
||||||
lc_content.append(create_image_block(base64=item.data, mime_type=item.mimeType))
|
|
||||||
elif isinstance(item, ResourceLink):
|
|
||||||
mime = item.mimeType or None
|
|
||||||
if mime and mime.startswith("image/"):
|
|
||||||
lc_content.append(create_image_block(url=str(item.uri), mime_type=mime))
|
|
||||||
else:
|
|
||||||
lc_content.append(create_file_block(url=str(item.uri), mime_type=mime))
|
|
||||||
elif isinstance(item, EmbeddedResource):
|
|
||||||
from mcp.types import BlobResourceContents
|
|
||||||
|
|
||||||
res = item.resource
|
|
||||||
if isinstance(res, TextResourceContents):
|
|
||||||
lc_content.append(create_text_block(text=res.text))
|
|
||||||
elif isinstance(res, BlobResourceContents):
|
|
||||||
mime = res.mimeType or None
|
|
||||||
if mime and mime.startswith("image/"):
|
|
||||||
lc_content.append(create_image_block(base64=res.blob, mime_type=mime))
|
|
||||||
else:
|
|
||||||
lc_content.append(create_file_block(base64=res.blob, mime_type=mime))
|
|
||||||
else:
|
|
||||||
lc_content.append(create_text_block(text=str(res)))
|
|
||||||
else:
|
|
||||||
lc_content.append(create_text_block(text=str(item)))
|
|
||||||
|
|
||||||
if call_tool_result.isError:
|
|
||||||
error_parts = [item["text"] for item in lc_content if isinstance(item, dict) and item.get("type") == "text"]
|
|
||||||
raise ToolException("\n".join(error_parts) if error_parts else str(lc_content))
|
|
||||||
|
|
||||||
artifact = None
|
|
||||||
if call_tool_result.structuredContent is not None:
|
|
||||||
artifact = {"structured_content": call_tool_result.structuredContent}
|
|
||||||
|
|
||||||
return lc_content, artifact
|
|
||||||
|
|
||||||
|
|
||||||
def _make_session_pool_tool(
|
|
||||||
tool: BaseTool,
|
|
||||||
server_name: str,
|
|
||||||
connection: dict[str, Any],
|
|
||||||
tool_interceptors: list[Any] | None = None,
|
|
||||||
) -> BaseTool:
|
|
||||||
"""Wrap an MCP tool so it reuses a persistent session from the pool.
|
|
||||||
|
|
||||||
Replaces the per-call session creation with pool-managed sessions scoped
|
|
||||||
by ``(server_name, thread_id)``. This ensures stateful MCP servers (e.g.
|
|
||||||
Playwright) keep their state across tool calls within the same thread.
|
|
||||||
|
|
||||||
The configured ``tool_interceptors`` (OAuth, custom) are preserved and
|
|
||||||
applied on every call before invoking the pooled session.
|
|
||||||
"""
|
|
||||||
# Strip the server-name prefix to recover the original MCP tool name.
|
|
||||||
original_name = tool.name
|
|
||||||
prefix = f"{server_name}_"
|
|
||||||
if original_name.startswith(prefix):
|
|
||||||
original_name = original_name[len(prefix) :]
|
|
||||||
|
|
||||||
pool = get_session_pool()
|
|
||||||
|
|
||||||
async def call_with_persistent_session(
|
|
||||||
runtime: Runtime | None = None,
|
|
||||||
**arguments: Any,
|
|
||||||
) -> Any:
|
|
||||||
thread_id = _extract_thread_id(runtime)
|
|
||||||
session = await pool.get_session(server_name, thread_id, connection)
|
|
||||||
|
|
||||||
if tool_interceptors:
|
|
||||||
from langchain_mcp_adapters.interceptors import MCPToolCallRequest
|
|
||||||
|
|
||||||
async def base_handler(request: MCPToolCallRequest) -> Any:
|
|
||||||
return await session.call_tool(request.name, request.args)
|
|
||||||
|
|
||||||
handler = base_handler
|
|
||||||
for interceptor in reversed(tool_interceptors):
|
|
||||||
outer = handler
|
|
||||||
|
|
||||||
async def wrapped(req: Any, _i: Any = interceptor, _h: Any = outer) -> Any:
|
|
||||||
return await _i(req, _h)
|
|
||||||
|
|
||||||
handler = wrapped
|
|
||||||
|
|
||||||
request = MCPToolCallRequest(
|
|
||||||
name=original_name,
|
|
||||||
args=arguments,
|
|
||||||
server_name=server_name,
|
|
||||||
runtime=runtime,
|
|
||||||
)
|
|
||||||
call_tool_result = await handler(request)
|
|
||||||
else:
|
|
||||||
call_tool_result = await session.call_tool(original_name, arguments)
|
|
||||||
|
|
||||||
return _convert_call_tool_result(call_tool_result)
|
|
||||||
|
|
||||||
return StructuredTool(
|
|
||||||
name=tool.name,
|
|
||||||
description=tool.description,
|
|
||||||
args_schema=tool.args_schema,
|
|
||||||
coroutine=call_with_persistent_session,
|
|
||||||
response_format="content_and_artifact",
|
|
||||||
metadata=tool.metadata,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
async def get_mcp_tools() -> list[BaseTool]:
|
async def get_mcp_tools() -> list[BaseTool]:
|
||||||
"""Get all tools from enabled MCP servers.
|
"""Get all tools from enabled MCP servers.
|
||||||
|
|
||||||
Tools using stdio transport are wrapped with persistent-session logic so
|
|
||||||
consecutive calls within the same thread reuse the same MCP session.
|
|
||||||
HTTP/SSE tools are returned unwrapped to avoid cross-task TaskGroup
|
|
||||||
cleanup errors.
|
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
List of LangChain tools from all enabled MCP servers.
|
List of LangChain tools from all enabled MCP servers.
|
||||||
"""
|
"""
|
||||||
@@ -212,7 +50,7 @@ async def get_mcp_tools() -> list[BaseTool]:
|
|||||||
existing_headers["Authorization"] = auth_header
|
existing_headers["Authorization"] = auth_header
|
||||||
servers_config[server_name]["headers"] = existing_headers
|
servers_config[server_name]["headers"] = existing_headers
|
||||||
|
|
||||||
tool_interceptors: list[Any] = []
|
tool_interceptors = []
|
||||||
oauth_interceptor = build_oauth_tool_interceptor(extensions_config)
|
oauth_interceptor = build_oauth_tool_interceptor(extensions_config)
|
||||||
if oauth_interceptor is not None:
|
if oauth_interceptor is not None:
|
||||||
tool_interceptors.append(oauth_interceptor)
|
tool_interceptors.append(oauth_interceptor)
|
||||||
@@ -236,49 +74,20 @@ async def get_mcp_tools() -> list[BaseTool]:
|
|||||||
elif interceptor is not None:
|
elif interceptor is not None:
|
||||||
logger.warning(f"Builder {interceptor_path} returned non-callable {type(interceptor).__name__}; skipping")
|
logger.warning(f"Builder {interceptor_path} returned non-callable {type(interceptor).__name__}; skipping")
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.warning(
|
logger.warning(f"Failed to load MCP interceptor {interceptor_path}: {e}", exc_info=True)
|
||||||
f"Failed to load MCP interceptor {interceptor_path}: {e}",
|
|
||||||
exc_info=True,
|
|
||||||
)
|
|
||||||
|
|
||||||
client = MultiServerMCPClient(
|
client = MultiServerMCPClient(servers_config, tool_interceptors=tool_interceptors, tool_name_prefix=True)
|
||||||
servers_config,
|
|
||||||
tool_interceptors=tool_interceptors,
|
|
||||||
tool_name_prefix=True,
|
|
||||||
)
|
|
||||||
|
|
||||||
# Get all tools from all servers (discovers tool definitions via
|
# Get all tools from all servers
|
||||||
# temporary sessions – the persistent-session wrapping is applied below).
|
|
||||||
tools = await client.get_tools()
|
tools = await client.get_tools()
|
||||||
logger.info(f"Successfully loaded {len(tools)} tool(s) from MCP servers")
|
logger.info(f"Successfully loaded {len(tools)} tool(s) from MCP servers")
|
||||||
|
|
||||||
# Wrap each tool with persistent-session logic.
|
|
||||||
# Only pool stdio sessions. HTTP/SSE transports use anyio TaskGroups
|
|
||||||
# internally which cannot be closed from a different async task, so
|
|
||||||
# pooling them causes RuntimeError on cleanup (see #3203).
|
|
||||||
wrapped_tools: list[BaseTool] = []
|
|
||||||
for tool in tools:
|
|
||||||
tool_server: str | None = None
|
|
||||||
for name in servers_config:
|
|
||||||
if tool.name.startswith(f"{name}_"):
|
|
||||||
tool_server = name
|
|
||||||
break
|
|
||||||
|
|
||||||
if tool_server is not None:
|
|
||||||
transport = servers_config[tool_server].get("transport", "stdio")
|
|
||||||
if transport == "stdio":
|
|
||||||
wrapped_tools.append(_make_session_pool_tool(tool, tool_server, servers_config[tool_server], tool_interceptors))
|
|
||||||
else:
|
|
||||||
wrapped_tools.append(tool)
|
|
||||||
else:
|
|
||||||
wrapped_tools.append(tool)
|
|
||||||
|
|
||||||
# Patch tools to support sync invocation, as deerflow client streams synchronously
|
# Patch tools to support sync invocation, as deerflow client streams synchronously
|
||||||
for tool in wrapped_tools:
|
for tool in tools:
|
||||||
if getattr(tool, "func", None) is None and getattr(tool, "coroutine", None) is not None:
|
if getattr(tool, "func", None) is None and getattr(tool, "coroutine", None) is not None:
|
||||||
tool.func = make_sync_tool_wrapper(tool.coroutine, tool.name)
|
tool.func = make_sync_tool_wrapper(tool.coroutine, tool.name)
|
||||||
|
|
||||||
return wrapped_tools
|
return tools
|
||||||
|
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.error(f"Failed to load MCP tools: {e}", exc_info=True)
|
logger.error(f"Failed to load MCP tools: {e}", exc_info=True)
|
||||||
|
|||||||
@@ -1,124 +0,0 @@
|
|||||||
"""Helpers for replaying provider-specific assistant message fields.
|
|
||||||
|
|
||||||
Several provider adapters need to preserve fields that LangChain stores on the
|
|
||||||
original ``AIMessage`` but drops when serializing request payloads. This module
|
|
||||||
keeps the assistant-message matching logic shared while letting each provider
|
|
||||||
decide which fields to restore.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import json
|
|
||||||
from collections.abc import Callable, Sequence
|
|
||||||
from typing import Any
|
|
||||||
|
|
||||||
from langchain_core.messages import AIMessage, BaseMessage
|
|
||||||
|
|
||||||
AssistantPayloadRestorer = Callable[[dict[str, Any], AIMessage], None]
|
|
||||||
|
|
||||||
|
|
||||||
def restore_assistant_payloads(
|
|
||||||
payload_messages: Sequence[dict[str, Any]],
|
|
||||||
original_messages: Sequence[BaseMessage],
|
|
||||||
restore: AssistantPayloadRestorer,
|
|
||||||
) -> None:
|
|
||||||
"""Restore provider-specific fields onto serialized assistant payloads."""
|
|
||||||
if len(payload_messages) == len(original_messages):
|
|
||||||
for payload_msg, orig_msg in zip(payload_messages, original_messages):
|
|
||||||
if payload_msg.get("role") == "assistant" and isinstance(orig_msg, AIMessage):
|
|
||||||
restore(payload_msg, orig_msg)
|
|
||||||
return
|
|
||||||
|
|
||||||
ai_messages = [m for m in original_messages if isinstance(m, AIMessage)]
|
|
||||||
assistant_payloads = [m for m in payload_messages if m.get("role") == "assistant"]
|
|
||||||
used_ai_indexes: set[int] = set()
|
|
||||||
|
|
||||||
for ordinal, payload_msg in enumerate(assistant_payloads):
|
|
||||||
ai_msg = _match_ai_message(payload_msg, ai_messages, used_ai_indexes, ordinal)
|
|
||||||
if ai_msg is not None:
|
|
||||||
restore(payload_msg, ai_msg)
|
|
||||||
|
|
||||||
|
|
||||||
def restore_additional_kwargs_field(payload_msg: dict[str, Any], orig_msg: AIMessage, field_name: str) -> None:
|
|
||||||
"""Copy a provider-specific ``additional_kwargs`` field onto a payload message."""
|
|
||||||
value = orig_msg.additional_kwargs.get(field_name)
|
|
||||||
if value is not None:
|
|
||||||
payload_msg[field_name] = value
|
|
||||||
|
|
||||||
|
|
||||||
def restore_reasoning_content(payload_msg: dict[str, Any], orig_msg: AIMessage) -> None:
|
|
||||||
"""Copy provider reasoning content onto a serialized assistant payload."""
|
|
||||||
restore_additional_kwargs_field(payload_msg, orig_msg, "reasoning_content")
|
|
||||||
|
|
||||||
|
|
||||||
def _match_ai_message(
|
|
||||||
payload_msg: dict[str, Any],
|
|
||||||
ai_messages: Sequence[AIMessage],
|
|
||||||
used_ai_indexes: set[int],
|
|
||||||
fallback_ordinal: int,
|
|
||||||
) -> AIMessage | None:
|
|
||||||
payload_key = _assistant_signature(payload_msg)
|
|
||||||
if payload_key is not None:
|
|
||||||
matches = [index for index, ai_msg in enumerate(ai_messages) if index not in used_ai_indexes and _ai_signature(ai_msg) == payload_key]
|
|
||||||
if len(matches) == 1:
|
|
||||||
used_ai_indexes.add(matches[0])
|
|
||||||
return ai_messages[matches[0]]
|
|
||||||
|
|
||||||
fallback_index = _next_unused_index_at_or_after(len(ai_messages), used_ai_indexes, fallback_ordinal)
|
|
||||||
if fallback_index is not None:
|
|
||||||
used_ai_indexes.add(fallback_index)
|
|
||||||
return ai_messages[fallback_index]
|
|
||||||
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
def _next_unused_index_at_or_after(count: int, used_ai_indexes: set[int], start: int) -> int | None:
|
|
||||||
"""Return the next unused AI index at or after ``start``.
|
|
||||||
|
|
||||||
Scanning forward from the payload's ordinal preserves the positional bias of
|
|
||||||
the previous behaviour while still recovering when serialization drops or
|
|
||||||
reorders messages so the exact ordinal index is already taken. It does not
|
|
||||||
wrap to earlier indexes because those messages may be represented by payload
|
|
||||||
entries that were already dropped.
|
|
||||||
"""
|
|
||||||
if count == 0 or start >= count:
|
|
||||||
return None
|
|
||||||
for index in range(start, count):
|
|
||||||
if index not in used_ai_indexes:
|
|
||||||
return index
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
def _assistant_signature(payload_msg: dict[str, Any]) -> tuple[str, str] | None:
|
|
||||||
return _signature(
|
|
||||||
payload_msg.get("content"),
|
|
||||||
_tool_call_ids(payload_msg.get("tool_calls") or []),
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def _ai_signature(message: AIMessage) -> tuple[str, str] | None:
|
|
||||||
tool_calls = message.tool_calls or message.additional_kwargs.get("tool_calls") or []
|
|
||||||
return _signature(message.content, _tool_call_ids(tool_calls))
|
|
||||||
|
|
||||||
|
|
||||||
def _signature(content: Any, tool_call_ids: tuple[str, ...]) -> tuple[str, str] | None:
|
|
||||||
if content in (None, "") and not tool_call_ids:
|
|
||||||
return None
|
|
||||||
return (_stable_repr(content), "|".join(tool_call_ids))
|
|
||||||
|
|
||||||
|
|
||||||
def _stable_repr(value: Any) -> str:
|
|
||||||
try:
|
|
||||||
return json.dumps(value, sort_keys=True, ensure_ascii=False)
|
|
||||||
except TypeError:
|
|
||||||
return repr(value)
|
|
||||||
|
|
||||||
|
|
||||||
def _tool_call_ids(tool_calls: Sequence[Any]) -> tuple[str, ...]:
|
|
||||||
ids: list[str] = []
|
|
||||||
for tool_call in tool_calls:
|
|
||||||
if isinstance(tool_call, dict):
|
|
||||||
call_id = tool_call.get("id")
|
|
||||||
if isinstance(call_id, str) and call_id:
|
|
||||||
ids.append(call_id)
|
|
||||||
return tuple(ids)
|
|
||||||
@@ -47,24 +47,11 @@ def _enable_stream_usage_by_default(model_use_path: str, model_settings_from_con
|
|||||||
model_settings_from_config["stream_usage"] = True
|
model_settings_from_config["stream_usage"] = True
|
||||||
|
|
||||||
|
|
||||||
def create_chat_model(name: str | None = None, thinking_enabled: bool = False, *, app_config: AppConfig | None = None, attach_tracing: bool = True, **kwargs) -> BaseChatModel:
|
def create_chat_model(name: str | None = None, thinking_enabled: bool = False, *, app_config: AppConfig | None = None, **kwargs) -> BaseChatModel:
|
||||||
"""Create a chat model instance from the config.
|
"""Create a chat model instance from the config.
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
name: The name of the model to create. If None, the first model in the config will be used.
|
name: The name of the model to create. If None, the first model in the config will be used.
|
||||||
thinking_enabled: Enable the model's extended-thinking mode when supported.
|
|
||||||
app_config: Explicit application config; falls back to the cached global if omitted.
|
|
||||||
attach_tracing: When True (default), attach tracing callbacks (Langfuse,
|
|
||||||
LangSmith) directly to the model instance. Standalone callers — anything
|
|
||||||
that invokes the model outside a LangGraph run that already wires tracing
|
|
||||||
at the invocation root (``MemoryUpdater``, ad-hoc utilities, etc.) — keep
|
|
||||||
this default so the model-level callback still produces traces. Callers
|
|
||||||
that already attach tracing at the graph root (``make_lead_agent``, the
|
|
||||||
in-graph ``TitleMiddleware``) MUST pass ``attach_tracing=False``; otherwise
|
|
||||||
the same LLM call emits duplicate spans (one rooted at the graph, one at
|
|
||||||
the model) and ``session_id`` / ``user_id`` metadata never reach the trace
|
|
||||||
because the model becomes a nested observation whose ``langfuse_*`` keys
|
|
||||||
get stripped.
|
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
A chat model instance.
|
A chat model instance.
|
||||||
@@ -162,10 +149,9 @@ def create_chat_model(name: str | None = None, thinking_enabled: bool = False, *
|
|||||||
|
|
||||||
model_instance = model_class(**kwargs, **model_settings_from_config)
|
model_instance = model_class(**kwargs, **model_settings_from_config)
|
||||||
|
|
||||||
if attach_tracing:
|
callbacks = build_tracing_callbacks()
|
||||||
callbacks = build_tracing_callbacks()
|
if callbacks:
|
||||||
if callbacks:
|
existing_callbacks = model_instance.callbacks or []
|
||||||
existing_callbacks = model_instance.callbacks or []
|
model_instance.callbacks = [*existing_callbacks, *callbacks]
|
||||||
model_instance.callbacks = [*existing_callbacks, *callbacks]
|
logger.debug(f"Tracing attached to model '{name}' with providers={len(callbacks)}")
|
||||||
logger.debug(f"Tracing attached to model '{name}' with providers={len(callbacks)}")
|
|
||||||
return model_instance
|
return model_instance
|
||||||
|
|||||||
@@ -10,10 +10,9 @@ on all assistant messages when thinking mode is enabled.
|
|||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
from langchain_core.language_models import LanguageModelInput
|
from langchain_core.language_models import LanguageModelInput
|
||||||
|
from langchain_core.messages import AIMessage
|
||||||
from langchain_deepseek import ChatDeepSeek
|
from langchain_deepseek import ChatDeepSeek
|
||||||
|
|
||||||
from deerflow.models.assistant_payload_replay import restore_assistant_payloads, restore_reasoning_content
|
|
||||||
|
|
||||||
|
|
||||||
class PatchedChatDeepSeek(ChatDeepSeek):
|
class PatchedChatDeepSeek(ChatDeepSeek):
|
||||||
"""ChatDeepSeek with proper reasoning_content preservation.
|
"""ChatDeepSeek with proper reasoning_content preservation.
|
||||||
@@ -50,10 +49,25 @@ class PatchedChatDeepSeek(ChatDeepSeek):
|
|||||||
# Call parent to get the base payload
|
# Call parent to get the base payload
|
||||||
payload = super()._get_request_payload(input_, stop=stop, **kwargs)
|
payload = super()._get_request_payload(input_, stop=stop, **kwargs)
|
||||||
|
|
||||||
restore_assistant_payloads(
|
# Match payload messages with original messages to restore reasoning_content
|
||||||
payload.get("messages", []),
|
payload_messages = payload.get("messages", [])
|
||||||
original_messages,
|
|
||||||
restore_reasoning_content,
|
# The payload messages and original messages should be in the same order
|
||||||
)
|
# Iterate through both and match by position
|
||||||
|
if len(payload_messages) == len(original_messages):
|
||||||
|
for payload_msg, orig_msg in zip(payload_messages, original_messages):
|
||||||
|
if payload_msg.get("role") == "assistant" and isinstance(orig_msg, AIMessage):
|
||||||
|
reasoning_content = orig_msg.additional_kwargs.get("reasoning_content")
|
||||||
|
if reasoning_content is not None:
|
||||||
|
payload_msg["reasoning_content"] = reasoning_content
|
||||||
|
else:
|
||||||
|
# Fallback: match by counting assistant messages
|
||||||
|
ai_messages = [m for m in original_messages if isinstance(m, AIMessage)]
|
||||||
|
assistant_payloads = [(i, m) for i, m in enumerate(payload_messages) if m.get("role") == "assistant"]
|
||||||
|
|
||||||
|
for (idx, payload_msg), ai_msg in zip(assistant_payloads, ai_messages):
|
||||||
|
reasoning_content = ai_msg.additional_kwargs.get("reasoning_content")
|
||||||
|
if reasoning_content is not None:
|
||||||
|
payload_messages[idx]["reasoning_content"] = reasoning_content
|
||||||
|
|
||||||
return payload
|
return payload
|
||||||
|
|||||||
@@ -1,140 +0,0 @@
|
|||||||
"""Patched ChatOpenAI adapter for Xiaomi MiMo reasoning_content replay.
|
|
||||||
|
|
||||||
MiMo's OpenAI-compatible API returns ``reasoning_content`` in thinking mode and
|
|
||||||
requires that value to be replayed on historical assistant messages in
|
|
||||||
multi-turn agent conversations. Standard ``langchain_openai.ChatOpenAI`` drops
|
|
||||||
that provider-specific field, which can cause HTTP 400 errors once tool calls
|
|
||||||
enter the conversation history.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from collections.abc import Mapping
|
|
||||||
from typing import Any
|
|
||||||
|
|
||||||
from langchain_core.language_models import LanguageModelInput
|
|
||||||
from langchain_core.messages import AIMessage, AIMessageChunk
|
|
||||||
from langchain_core.outputs import ChatGeneration, ChatGenerationChunk, ChatResult
|
|
||||||
from langchain_openai import ChatOpenAI
|
|
||||||
|
|
||||||
from deerflow.models.assistant_payload_replay import restore_assistant_payloads, restore_reasoning_content
|
|
||||||
|
|
||||||
_MISSING = object()
|
|
||||||
|
|
||||||
|
|
||||||
def _extract_reasoning_content(value: Any) -> str | object:
|
|
||||||
"""Return reasoning_content from a dict/Pydantic object, preserving empty strings."""
|
|
||||||
if isinstance(value, Mapping):
|
|
||||||
if "reasoning_content" in value and value["reasoning_content"] is not None:
|
|
||||||
return value["reasoning_content"]
|
|
||||||
return _MISSING
|
|
||||||
|
|
||||||
reasoning = getattr(value, "reasoning_content", _MISSING)
|
|
||||||
if reasoning is not _MISSING and reasoning is not None:
|
|
||||||
return reasoning
|
|
||||||
|
|
||||||
model_extra = getattr(value, "model_extra", None)
|
|
||||||
if isinstance(model_extra, Mapping) and "reasoning_content" in model_extra and model_extra["reasoning_content"] is not None:
|
|
||||||
return model_extra["reasoning_content"]
|
|
||||||
|
|
||||||
return _MISSING
|
|
||||||
|
|
||||||
|
|
||||||
def _with_reasoning_content(message: AIMessage | AIMessageChunk, reasoning: str) -> AIMessage | AIMessageChunk:
|
|
||||||
additional_kwargs = dict(message.additional_kwargs)
|
|
||||||
if additional_kwargs.get("reasoning_content") != reasoning:
|
|
||||||
additional_kwargs["reasoning_content"] = reasoning
|
|
||||||
return message.model_copy(update={"additional_kwargs": additional_kwargs})
|
|
||||||
|
|
||||||
|
|
||||||
def _get_typed_choice_message(response: Any, index: int) -> Any:
|
|
||||||
choices = getattr(response, "choices", None)
|
|
||||||
if choices is None:
|
|
||||||
return None
|
|
||||||
try:
|
|
||||||
return choices[index].message
|
|
||||||
except (AttributeError, IndexError, TypeError):
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
class PatchedChatMiMo(ChatOpenAI):
|
|
||||||
"""ChatOpenAI with ``reasoning_content`` preservation for MiMo thinking mode."""
|
|
||||||
|
|
||||||
@classmethod
|
|
||||||
def is_lc_serializable(cls) -> bool:
|
|
||||||
return True
|
|
||||||
|
|
||||||
@property
|
|
||||||
def lc_secrets(self) -> dict[str, str]:
|
|
||||||
return {"api_key": "MIMO_API_KEY", "openai_api_key": "MIMO_API_KEY"}
|
|
||||||
|
|
||||||
def _get_request_payload(
|
|
||||||
self,
|
|
||||||
input_: LanguageModelInput,
|
|
||||||
*,
|
|
||||||
stop: list[str] | None = None,
|
|
||||||
**kwargs: Any,
|
|
||||||
) -> dict:
|
|
||||||
original_messages = self._convert_input(input_).to_messages()
|
|
||||||
payload = super()._get_request_payload(input_, stop=stop, **kwargs)
|
|
||||||
restore_assistant_payloads(
|
|
||||||
payload.get("messages", []),
|
|
||||||
original_messages,
|
|
||||||
restore_reasoning_content,
|
|
||||||
)
|
|
||||||
|
|
||||||
return payload
|
|
||||||
|
|
||||||
def _convert_chunk_to_generation_chunk(
|
|
||||||
self,
|
|
||||||
chunk: dict,
|
|
||||||
default_chunk_class: type,
|
|
||||||
base_generation_info: dict | None,
|
|
||||||
) -> ChatGenerationChunk | None:
|
|
||||||
generation_chunk = super()._convert_chunk_to_generation_chunk(
|
|
||||||
chunk,
|
|
||||||
default_chunk_class,
|
|
||||||
base_generation_info,
|
|
||||||
)
|
|
||||||
if generation_chunk is None:
|
|
||||||
return None
|
|
||||||
|
|
||||||
choices = chunk.get("choices", [])
|
|
||||||
if choices:
|
|
||||||
delta = choices[0].get("delta") or {}
|
|
||||||
reasoning = _extract_reasoning_content(delta)
|
|
||||||
if reasoning is not _MISSING and isinstance(generation_chunk.message, AIMessageChunk):
|
|
||||||
generation_chunk = ChatGenerationChunk(
|
|
||||||
message=_with_reasoning_content(generation_chunk.message, reasoning),
|
|
||||||
generation_info=generation_chunk.generation_info,
|
|
||||||
)
|
|
||||||
|
|
||||||
return generation_chunk
|
|
||||||
|
|
||||||
def _create_chat_result(
|
|
||||||
self,
|
|
||||||
response: dict | Any,
|
|
||||||
generation_info: dict | None = None,
|
|
||||||
) -> ChatResult:
|
|
||||||
result = super()._create_chat_result(response, generation_info)
|
|
||||||
response_dict = response if isinstance(response, dict) else response.model_dump()
|
|
||||||
choices = response_dict.get("choices", [])
|
|
||||||
|
|
||||||
patched_generations: list[ChatGeneration] | None = None
|
|
||||||
for index, generation in enumerate(result.generations):
|
|
||||||
choice = choices[index] if index < len(choices) else {}
|
|
||||||
choice_message = choice.get("message", {}) if isinstance(choice, Mapping) else {}
|
|
||||||
reasoning = _extract_reasoning_content(choice_message)
|
|
||||||
if reasoning is _MISSING and not isinstance(response, dict):
|
|
||||||
reasoning = _extract_reasoning_content(_get_typed_choice_message(response, index))
|
|
||||||
|
|
||||||
message = generation.message
|
|
||||||
if reasoning is not _MISSING and isinstance(message, AIMessage):
|
|
||||||
if patched_generations is None:
|
|
||||||
patched_generations = list(result.generations)
|
|
||||||
patched_generations[index] = ChatGeneration(
|
|
||||||
message=_with_reasoning_content(message, reasoning),
|
|
||||||
generation_info=generation.generation_info,
|
|
||||||
)
|
|
||||||
|
|
||||||
return ChatResult(generations=patched_generations or result.generations, llm_output=result.llm_output)
|
|
||||||
@@ -27,8 +27,6 @@ from langchain_core.language_models import LanguageModelInput
|
|||||||
from langchain_core.messages import AIMessage
|
from langchain_core.messages import AIMessage
|
||||||
from langchain_openai import ChatOpenAI
|
from langchain_openai import ChatOpenAI
|
||||||
|
|
||||||
from deerflow.models.assistant_payload_replay import restore_assistant_payloads
|
|
||||||
|
|
||||||
|
|
||||||
class PatchedChatOpenAI(ChatOpenAI):
|
class PatchedChatOpenAI(ChatOpenAI):
|
||||||
"""ChatOpenAI with ``thought_signature`` preservation for Gemini thinking via OpenAI gateway.
|
"""ChatOpenAI with ``thought_signature`` preservation for Gemini thinking via OpenAI gateway.
|
||||||
@@ -77,7 +75,18 @@ class PatchedChatOpenAI(ChatOpenAI):
|
|||||||
# Obtain the base payload from the parent implementation.
|
# Obtain the base payload from the parent implementation.
|
||||||
payload = super()._get_request_payload(input_, stop=stop, **kwargs)
|
payload = super()._get_request_payload(input_, stop=stop, **kwargs)
|
||||||
|
|
||||||
restore_assistant_payloads(payload.get("messages", []), original_messages, _restore_tool_call_signatures)
|
payload_messages = payload.get("messages", [])
|
||||||
|
|
||||||
|
if len(payload_messages) == len(original_messages):
|
||||||
|
for payload_msg, orig_msg in zip(payload_messages, original_messages):
|
||||||
|
if payload_msg.get("role") == "assistant" and isinstance(orig_msg, AIMessage):
|
||||||
|
_restore_tool_call_signatures(payload_msg, orig_msg)
|
||||||
|
else:
|
||||||
|
# Fallback: match assistant-role entries positionally against AIMessages.
|
||||||
|
ai_messages = [m for m in original_messages if isinstance(m, AIMessage)]
|
||||||
|
assistant_payloads = [(i, m) for i, m in enumerate(payload_messages) if m.get("role") == "assistant"]
|
||||||
|
for (_, payload_msg), ai_msg in zip(assistant_payloads, ai_messages):
|
||||||
|
_restore_tool_call_signatures(payload_msg, ai_msg)
|
||||||
|
|
||||||
return payload
|
return payload
|
||||||
|
|
||||||
|
|||||||
@@ -13,7 +13,6 @@ from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker
|
|||||||
|
|
||||||
from deerflow.persistence.feedback.model import FeedbackRow
|
from deerflow.persistence.feedback.model import FeedbackRow
|
||||||
from deerflow.runtime.user_context import AUTO, _AutoSentinel, resolve_user_id
|
from deerflow.runtime.user_context import AUTO, _AutoSentinel, resolve_user_id
|
||||||
from deerflow.utils.time import coerce_iso
|
|
||||||
|
|
||||||
|
|
||||||
class FeedbackRepository:
|
class FeedbackRepository:
|
||||||
@@ -25,8 +24,7 @@ class FeedbackRepository:
|
|||||||
d = row.to_dict()
|
d = row.to_dict()
|
||||||
val = d.get("created_at")
|
val = d.get("created_at")
|
||||||
if isinstance(val, datetime):
|
if isinstance(val, datetime):
|
||||||
# SQLite drops tzinfo on read; normalize via ``coerce_iso`` so output is always tz-aware.
|
d["created_at"] = val.isoformat()
|
||||||
d["created_at"] = coerce_iso(val)
|
|
||||||
return d
|
return d
|
||||||
|
|
||||||
async def create(
|
async def create(
|
||||||
|
|||||||
@@ -17,7 +17,6 @@ from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker
|
|||||||
from deerflow.persistence.run.model import RunRow
|
from deerflow.persistence.run.model import RunRow
|
||||||
from deerflow.runtime.runs.store.base import RunStore
|
from deerflow.runtime.runs.store.base import RunStore
|
||||||
from deerflow.runtime.user_context import AUTO, _AutoSentinel, resolve_user_id
|
from deerflow.runtime.user_context import AUTO, _AutoSentinel, resolve_user_id
|
||||||
from deerflow.utils.time import coerce_iso
|
|
||||||
|
|
||||||
|
|
||||||
class RunRepository(RunStore):
|
class RunRepository(RunStore):
|
||||||
@@ -69,13 +68,11 @@ class RunRepository(RunStore):
|
|||||||
# Remap JSON columns to match RunStore interface
|
# Remap JSON columns to match RunStore interface
|
||||||
d["metadata"] = d.pop("metadata_json", {})
|
d["metadata"] = d.pop("metadata_json", {})
|
||||||
d["kwargs"] = d.pop("kwargs_json", {})
|
d["kwargs"] = d.pop("kwargs_json", {})
|
||||||
# Convert datetime to ISO string for consistency with MemoryRunStore.
|
# Convert datetime to ISO string for consistency with MemoryRunStore
|
||||||
# SQLite drops tzinfo on read despite ``DateTime(timezone=True)`` —
|
|
||||||
# ``coerce_iso`` normalizes naive datetimes as UTC.
|
|
||||||
for key in ("created_at", "updated_at"):
|
for key in ("created_at", "updated_at"):
|
||||||
val = d.get(key)
|
val = d.get(key)
|
||||||
if isinstance(val, datetime):
|
if isinstance(val, datetime):
|
||||||
d[key] = coerce_iso(val)
|
d[key] = val.isoformat()
|
||||||
return d
|
return d
|
||||||
|
|
||||||
async def put(
|
async def put(
|
||||||
@@ -94,35 +91,25 @@ class RunRepository(RunStore):
|
|||||||
created_at=None,
|
created_at=None,
|
||||||
follow_up_to_run_id=None,
|
follow_up_to_run_id=None,
|
||||||
):
|
):
|
||||||
"""Insert or update a run row.
|
|
||||||
|
|
||||||
``RunManager`` retries ``put`` after transient SQLite failures. Making
|
|
||||||
this operation idempotent prevents a successful-but-unacknowledged first
|
|
||||||
commit from turning the retry into a primary-key failure.
|
|
||||||
"""
|
|
||||||
resolved_user_id = resolve_user_id(user_id, method_name="RunRepository.put")
|
resolved_user_id = resolve_user_id(user_id, method_name="RunRepository.put")
|
||||||
now = datetime.now(UTC)
|
now = datetime.now(UTC)
|
||||||
created = datetime.fromisoformat(created_at) if created_at else now
|
row = RunRow(
|
||||||
values = {
|
run_id=run_id,
|
||||||
"thread_id": thread_id,
|
thread_id=thread_id,
|
||||||
"assistant_id": assistant_id,
|
assistant_id=assistant_id,
|
||||||
"user_id": resolved_user_id,
|
user_id=resolved_user_id,
|
||||||
"model_name": self._normalize_model_name(model_name),
|
model_name=self._normalize_model_name(model_name),
|
||||||
"status": status,
|
status=status,
|
||||||
"multitask_strategy": multitask_strategy,
|
multitask_strategy=multitask_strategy,
|
||||||
"metadata_json": self._safe_json(metadata) or {},
|
metadata_json=self._safe_json(metadata) or {},
|
||||||
"kwargs_json": self._safe_json(kwargs) or {},
|
kwargs_json=self._safe_json(kwargs) or {},
|
||||||
"error": error,
|
error=error,
|
||||||
"follow_up_to_run_id": follow_up_to_run_id,
|
follow_up_to_run_id=follow_up_to_run_id,
|
||||||
"updated_at": now,
|
created_at=datetime.fromisoformat(created_at) if created_at else now,
|
||||||
}
|
updated_at=now,
|
||||||
|
)
|
||||||
async with self._sf() as session:
|
async with self._sf() as session:
|
||||||
row = await session.get(RunRow, run_id)
|
session.add(row)
|
||||||
if row is None:
|
|
||||||
session.add(RunRow(run_id=run_id, created_at=created, **values))
|
|
||||||
else:
|
|
||||||
for key, value in values.items():
|
|
||||||
setattr(row, key, value)
|
|
||||||
await session.commit()
|
await session.commit()
|
||||||
|
|
||||||
async def get(
|
async def get(
|
||||||
@@ -156,18 +143,12 @@ class RunRepository(RunStore):
|
|||||||
result = await session.execute(stmt)
|
result = await session.execute(stmt)
|
||||||
return [self._row_to_dict(r) for r in result.scalars()]
|
return [self._row_to_dict(r) for r in result.scalars()]
|
||||||
|
|
||||||
async def update_status(self, run_id, status, *, error=None) -> bool:
|
async def update_status(self, run_id, status, *, error=None):
|
||||||
values: dict[str, Any] = {"status": status, "updated_at": datetime.now(UTC)}
|
values: dict[str, Any] = {"status": status, "updated_at": datetime.now(UTC)}
|
||||||
if error is not None:
|
if error is not None:
|
||||||
values["error"] = error
|
values["error"] = error
|
||||||
async with self._sf() as session:
|
async with self._sf() as session:
|
||||||
result = await session.execute(update(RunRow).where(RunRow.run_id == run_id).values(**values))
|
await session.execute(update(RunRow).where(RunRow.run_id == run_id).values(**values))
|
||||||
await session.commit()
|
|
||||||
return result.rowcount != 0
|
|
||||||
|
|
||||||
async def update_model_name(self, run_id, model_name):
|
|
||||||
async with self._sf() as session:
|
|
||||||
await session.execute(update(RunRow).where(RunRow.run_id == run_id).values(model_name=self._normalize_model_name(model_name), updated_at=datetime.now(UTC)))
|
|
||||||
await session.commit()
|
await session.commit()
|
||||||
|
|
||||||
async def delete(
|
async def delete(
|
||||||
@@ -198,26 +179,6 @@ class RunRepository(RunStore):
|
|||||||
result = await session.execute(stmt)
|
result = await session.execute(stmt)
|
||||||
return [self._row_to_dict(r) for r in result.scalars()]
|
return [self._row_to_dict(r) for r in result.scalars()]
|
||||||
|
|
||||||
async def list_inflight(self, *, before=None):
|
|
||||||
"""Return persisted active runs for startup recovery."""
|
|
||||||
if before is None:
|
|
||||||
before_dt = datetime.now(UTC)
|
|
||||||
elif isinstance(before, datetime):
|
|
||||||
before_dt = before
|
|
||||||
else:
|
|
||||||
before_dt = datetime.fromisoformat(before)
|
|
||||||
stmt = (
|
|
||||||
select(RunRow)
|
|
||||||
.where(
|
|
||||||
RunRow.status.in_(("pending", "running")),
|
|
||||||
RunRow.created_at <= before_dt,
|
|
||||||
)
|
|
||||||
.order_by(RunRow.created_at.asc())
|
|
||||||
)
|
|
||||||
async with self._sf() as session:
|
|
||||||
result = await session.execute(stmt)
|
|
||||||
return [self._row_to_dict(r) for r in result.scalars()]
|
|
||||||
|
|
||||||
async def update_run_completion(
|
async def update_run_completion(
|
||||||
self,
|
self,
|
||||||
run_id: str,
|
run_id: str,
|
||||||
@@ -234,11 +195,8 @@ class RunRepository(RunStore):
|
|||||||
last_ai_message: str | None = None,
|
last_ai_message: str | None = None,
|
||||||
first_human_message: str | None = None,
|
first_human_message: str | None = None,
|
||||||
error: str | None = None,
|
error: str | None = None,
|
||||||
) -> bool:
|
) -> None:
|
||||||
"""Update status + token usage + convenience fields on run completion.
|
"""Update status + token usage + convenience fields on run completion."""
|
||||||
|
|
||||||
Returns ``False`` when no run row matched the requested ``run_id``.
|
|
||||||
"""
|
|
||||||
values: dict[str, Any] = {
|
values: dict[str, Any] = {
|
||||||
"status": status,
|
"status": status,
|
||||||
"total_input_tokens": total_input_tokens,
|
"total_input_tokens": total_input_tokens,
|
||||||
@@ -258,52 +216,12 @@ class RunRepository(RunStore):
|
|||||||
if error is not None:
|
if error is not None:
|
||||||
values["error"] = error
|
values["error"] = error
|
||||||
async with self._sf() as session:
|
async with self._sf() as session:
|
||||||
result = await session.execute(update(RunRow).where(RunRow.run_id == run_id).values(**values))
|
await session.execute(update(RunRow).where(RunRow.run_id == run_id).values(**values))
|
||||||
await session.commit()
|
|
||||||
return result.rowcount != 0
|
|
||||||
|
|
||||||
async def update_run_progress(
|
|
||||||
self,
|
|
||||||
run_id: str,
|
|
||||||
*,
|
|
||||||
total_input_tokens: int | None = None,
|
|
||||||
total_output_tokens: int | None = None,
|
|
||||||
total_tokens: int | None = None,
|
|
||||||
llm_call_count: int | None = None,
|
|
||||||
lead_agent_tokens: int | None = None,
|
|
||||||
subagent_tokens: int | None = None,
|
|
||||||
middleware_tokens: int | None = None,
|
|
||||||
message_count: int | None = None,
|
|
||||||
last_ai_message: str | None = None,
|
|
||||||
first_human_message: str | None = None,
|
|
||||||
) -> None:
|
|
||||||
"""Update token usage + convenience fields while a run is still active."""
|
|
||||||
values: dict[str, Any] = {"updated_at": datetime.now(UTC)}
|
|
||||||
optional_counters = {
|
|
||||||
"total_input_tokens": total_input_tokens,
|
|
||||||
"total_output_tokens": total_output_tokens,
|
|
||||||
"total_tokens": total_tokens,
|
|
||||||
"llm_call_count": llm_call_count,
|
|
||||||
"lead_agent_tokens": lead_agent_tokens,
|
|
||||||
"subagent_tokens": subagent_tokens,
|
|
||||||
"middleware_tokens": middleware_tokens,
|
|
||||||
"message_count": message_count,
|
|
||||||
}
|
|
||||||
for key, value in optional_counters.items():
|
|
||||||
if value is not None:
|
|
||||||
values[key] = value
|
|
||||||
if last_ai_message is not None:
|
|
||||||
values["last_ai_message"] = last_ai_message[:2000]
|
|
||||||
if first_human_message is not None:
|
|
||||||
values["first_human_message"] = first_human_message[:2000]
|
|
||||||
async with self._sf() as session:
|
|
||||||
await session.execute(update(RunRow).where(RunRow.run_id == run_id, RunRow.status == "running").values(**values))
|
|
||||||
await session.commit()
|
await session.commit()
|
||||||
|
|
||||||
async def aggregate_tokens_by_thread(self, thread_id: str, *, include_active: bool = False) -> dict[str, Any]:
|
async def aggregate_tokens_by_thread(self, thread_id: str) -> dict[str, Any]:
|
||||||
"""Aggregate token usage via a single SQL GROUP BY query."""
|
"""Aggregate token usage via a single SQL GROUP BY query."""
|
||||||
statuses = ("success", "error", "running") if include_active else ("success", "error")
|
_completed = RunRow.status.in_(("success", "error"))
|
||||||
_completed = RunRow.status.in_(statuses)
|
|
||||||
_thread = RunRow.thread_id == thread_id
|
_thread = RunRow.thread_id == thread_id
|
||||||
model_name = func.coalesce(RunRow.model_name, "unknown")
|
model_name = func.coalesce(RunRow.model_name, "unknown")
|
||||||
|
|
||||||
|
|||||||
@@ -13,7 +13,6 @@ from deerflow.persistence.json_compat import json_match
|
|||||||
from deerflow.persistence.thread_meta.base import InvalidMetadataFilterError, ThreadMetaStore
|
from deerflow.persistence.thread_meta.base import InvalidMetadataFilterError, ThreadMetaStore
|
||||||
from deerflow.persistence.thread_meta.model import ThreadMetaRow
|
from deerflow.persistence.thread_meta.model import ThreadMetaRow
|
||||||
from deerflow.runtime.user_context import AUTO, _AutoSentinel, resolve_user_id
|
from deerflow.runtime.user_context import AUTO, _AutoSentinel, resolve_user_id
|
||||||
from deerflow.utils.time import coerce_iso
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
@@ -29,9 +28,7 @@ class ThreadMetaRepository(ThreadMetaStore):
|
|||||||
for key in ("created_at", "updated_at"):
|
for key in ("created_at", "updated_at"):
|
||||||
val = d.get(key)
|
val = d.get(key)
|
||||||
if isinstance(val, datetime):
|
if isinstance(val, datetime):
|
||||||
# SQLite drops tzinfo despite ``DateTime(timezone=True)``;
|
d[key] = val.isoformat()
|
||||||
# ``coerce_iso`` normalizes naive values as UTC so the wire format always carries tz.
|
|
||||||
d[key] = coerce_iso(val)
|
|
||||||
return d
|
return d
|
||||||
|
|
||||||
async def create(
|
async def create(
|
||||||
|
|||||||
@@ -34,19 +34,6 @@ from deerflow.runtime.store._sqlite_utils import ensure_sqlite_parent_dir, resol
|
|||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
def _prepare_sqlite_checkpointer_path(raw: str) -> str:
|
|
||||||
conn_str = resolve_sqlite_conn_str(raw)
|
|
||||||
ensure_sqlite_parent_dir(conn_str)
|
|
||||||
return conn_str
|
|
||||||
|
|
||||||
|
|
||||||
def _prepare_database_sqlite_checkpointer_path(db_config) -> str:
|
|
||||||
conn_str = db_config.checkpointer_sqlite_path
|
|
||||||
ensure_sqlite_parent_dir(conn_str)
|
|
||||||
return conn_str
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
# Async factory
|
# Async factory
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
@@ -67,7 +54,8 @@ async def _async_checkpointer(config) -> AsyncIterator[Checkpointer]:
|
|||||||
except ImportError as exc:
|
except ImportError as exc:
|
||||||
raise ImportError(SQLITE_INSTALL) from exc
|
raise ImportError(SQLITE_INSTALL) from exc
|
||||||
|
|
||||||
conn_str = await asyncio.to_thread(_prepare_sqlite_checkpointer_path, config.connection_string or "store.db")
|
conn_str = resolve_sqlite_conn_str(config.connection_string or "store.db")
|
||||||
|
await asyncio.to_thread(ensure_sqlite_parent_dir, conn_str)
|
||||||
async with AsyncSqliteSaver.from_conn_string(conn_str) as saver:
|
async with AsyncSqliteSaver.from_conn_string(conn_str) as saver:
|
||||||
await saver.setup()
|
await saver.setup()
|
||||||
yield saver
|
yield saver
|
||||||
@@ -110,7 +98,8 @@ async def _async_checkpointer_from_database(db_config) -> AsyncIterator[Checkpoi
|
|||||||
except ImportError as exc:
|
except ImportError as exc:
|
||||||
raise ImportError(SQLITE_INSTALL) from exc
|
raise ImportError(SQLITE_INSTALL) from exc
|
||||||
|
|
||||||
conn_str = await asyncio.to_thread(_prepare_database_sqlite_checkpointer_path, db_config)
|
conn_str = db_config.checkpointer_sqlite_path
|
||||||
|
ensure_sqlite_parent_dir(conn_str)
|
||||||
async with AsyncSqliteSaver.from_conn_string(conn_str) as saver:
|
async with AsyncSqliteSaver.from_conn_string(conn_str) as saver:
|
||||||
await saver.setup()
|
await saver.setup()
|
||||||
yield saver
|
yield saver
|
||||||
|
|||||||
@@ -17,7 +17,6 @@ from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker
|
|||||||
from deerflow.persistence.models.run_event import RunEventRow
|
from deerflow.persistence.models.run_event import RunEventRow
|
||||||
from deerflow.runtime.events.store.base import RunEventStore
|
from deerflow.runtime.events.store.base import RunEventStore
|
||||||
from deerflow.runtime.user_context import AUTO, _AutoSentinel, get_current_user, resolve_user_id
|
from deerflow.runtime.user_context import AUTO, _AutoSentinel, get_current_user, resolve_user_id
|
||||||
from deerflow.utils.time import coerce_iso
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
@@ -33,9 +32,7 @@ class DbRunEventStore(RunEventStore):
|
|||||||
d["metadata"] = d.pop("event_metadata", {})
|
d["metadata"] = d.pop("event_metadata", {})
|
||||||
val = d.get("created_at")
|
val = d.get("created_at")
|
||||||
if isinstance(val, datetime):
|
if isinstance(val, datetime):
|
||||||
# SQLite drops tzinfo on read despite ``DateTime(timezone=True)``;
|
d["created_at"] = val.isoformat()
|
||||||
# ``coerce_iso`` normalizes naive datetimes as UTC.
|
|
||||||
d["created_at"] = coerce_iso(val)
|
|
||||||
d.pop("id", None)
|
d.pop("id", None)
|
||||||
# Restore structured content that was JSON-serialized on write.
|
# Restore structured content that was JSON-serialized on write.
|
||||||
raw = d.get("content", "")
|
raw = d.get("content", "")
|
||||||
@@ -144,13 +141,10 @@ class DbRunEventStore(RunEventStore):
|
|||||||
async def put_batch(self, events):
|
async def put_batch(self, events):
|
||||||
if not events:
|
if not events:
|
||||||
return []
|
return []
|
||||||
thread_ids = {e["thread_id"] for e in events}
|
|
||||||
if len(thread_ids) > 1:
|
|
||||||
raise ValueError(f"put_batch requires all events to belong to the same thread; got {thread_ids!r}")
|
|
||||||
user_id = self._user_id_from_context()
|
user_id = self._user_id_from_context()
|
||||||
async with self._sf() as session:
|
async with self._sf() as session:
|
||||||
async with session.begin():
|
async with session.begin():
|
||||||
# All events belong to the same thread (validated above).
|
# Get max seq for the thread (assume all events in batch belong to same thread).
|
||||||
thread_id = events[0]["thread_id"]
|
thread_id = events[0]["thread_id"]
|
||||||
max_seq = await self._max_seq_for_thread(session, thread_id)
|
max_seq = await self._max_seq_for_thread(session, thread_id)
|
||||||
seq = max_seq or 0
|
seq = max_seq or 0
|
||||||
|
|||||||
@@ -6,15 +6,6 @@ Each run's events are stored in a single file:
|
|||||||
All categories (message, trace, lifecycle) are in the same file.
|
All categories (message, trace, lifecycle) are in the same file.
|
||||||
This backend is suitable for lightweight single-node deployments.
|
This backend is suitable for lightweight single-node deployments.
|
||||||
|
|
||||||
**Single-process guarantee**: the in-memory seq counter is process-local.
|
|
||||||
Multi-process deployments sharing the same directory will produce duplicate
|
|
||||||
or non-monotonic seq values. Use ``DbRunEventStore`` for multi-process or
|
|
||||||
high-concurrency deployments.
|
|
||||||
|
|
||||||
File I/O is offloaded to a thread pool via ``asyncio.to_thread`` so the
|
|
||||||
event loop is never blocked. Per-thread ``asyncio.Lock`` objects serialise
|
|
||||||
writes within a single process to prevent interleaved JSONL lines.
|
|
||||||
|
|
||||||
Known trade-off: ``list_messages()`` must scan all run files for a
|
Known trade-off: ``list_messages()`` must scan all run files for a
|
||||||
thread since messages from multiple runs need unified seq ordering.
|
thread since messages from multiple runs need unified seq ordering.
|
||||||
``list_events()`` reads only one file -- the fast path.
|
``list_events()`` reads only one file -- the fast path.
|
||||||
@@ -22,7 +13,6 @@ thread since messages from multiple runs need unified seq ordering.
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import asyncio
|
|
||||||
import json
|
import json
|
||||||
import logging
|
import logging
|
||||||
import re
|
import re
|
||||||
@@ -40,11 +30,6 @@ class JsonlRunEventStore(RunEventStore):
|
|||||||
def __init__(self, base_dir: str | Path | None = None):
|
def __init__(self, base_dir: str | Path | None = None):
|
||||||
self._base_dir = Path(base_dir) if base_dir else Path(".deer-flow")
|
self._base_dir = Path(base_dir) if base_dir else Path(".deer-flow")
|
||||||
self._seq_counters: dict[str, int] = {} # thread_id -> current max seq
|
self._seq_counters: dict[str, int] = {} # thread_id -> current max seq
|
||||||
# Per-thread asyncio.Lock — serialises concurrent writes within one process.
|
|
||||||
self._write_locks: dict[str, asyncio.Lock] = {}
|
|
||||||
|
|
||||||
def _get_write_lock(self, thread_id: str) -> asyncio.Lock:
|
|
||||||
return self._write_locks.setdefault(thread_id, asyncio.Lock())
|
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def _validate_id(value: str, label: str) -> str:
|
def _validate_id(value: str, label: str) -> str:
|
||||||
@@ -65,8 +50,10 @@ class JsonlRunEventStore(RunEventStore):
|
|||||||
self._seq_counters[thread_id] = self._seq_counters.get(thread_id, 0) + 1
|
self._seq_counters[thread_id] = self._seq_counters.get(thread_id, 0) + 1
|
||||||
return self._seq_counters[thread_id]
|
return self._seq_counters[thread_id]
|
||||||
|
|
||||||
def _compute_max_seq(self, thread_id: str) -> int:
|
def _ensure_seq_loaded(self, thread_id: str) -> None:
|
||||||
"""Scan all run files for a thread and return the current max seq (blocking I/O)."""
|
"""Load max seq from existing files if not yet cached."""
|
||||||
|
if thread_id in self._seq_counters:
|
||||||
|
return
|
||||||
max_seq = 0
|
max_seq = 0
|
||||||
thread_dir = self._thread_dir(thread_id)
|
thread_dir = self._thread_dir(thread_id)
|
||||||
if thread_dir.exists():
|
if thread_dir.exists():
|
||||||
@@ -77,13 +64,7 @@ class JsonlRunEventStore(RunEventStore):
|
|||||||
max_seq = max(max_seq, record.get("seq", 0))
|
max_seq = max(max_seq, record.get("seq", 0))
|
||||||
except json.JSONDecodeError:
|
except json.JSONDecodeError:
|
||||||
logger.debug("Skipping malformed JSONL line in %s", f)
|
logger.debug("Skipping malformed JSONL line in %s", f)
|
||||||
return max_seq
|
continue
|
||||||
|
|
||||||
async def _ensure_seq_loaded(self, thread_id: str) -> None:
|
|
||||||
"""Load max seq from existing files into the in-memory counter (non-blocking)."""
|
|
||||||
if thread_id in self._seq_counters:
|
|
||||||
return
|
|
||||||
max_seq = await asyncio.to_thread(self._compute_max_seq, thread_id)
|
|
||||||
self._seq_counters[thread_id] = max_seq
|
self._seq_counters[thread_id] = max_seq
|
||||||
|
|
||||||
def _write_record(self, record: dict) -> None:
|
def _write_record(self, record: dict) -> None:
|
||||||
@@ -93,7 +74,7 @@ class JsonlRunEventStore(RunEventStore):
|
|||||||
f.write(json.dumps(record, default=str, ensure_ascii=False) + "\n")
|
f.write(json.dumps(record, default=str, ensure_ascii=False) + "\n")
|
||||||
|
|
||||||
def _read_thread_events(self, thread_id: str) -> list[dict]:
|
def _read_thread_events(self, thread_id: str) -> list[dict]:
|
||||||
"""Read all events for a thread, sorted by seq (blocking I/O)."""
|
"""Read all events for a thread, sorted by seq."""
|
||||||
events = []
|
events = []
|
||||||
thread_dir = self._thread_dir(thread_id)
|
thread_dir = self._thread_dir(thread_id)
|
||||||
if not thread_dir.exists():
|
if not thread_dir.exists():
|
||||||
@@ -106,11 +87,12 @@ class JsonlRunEventStore(RunEventStore):
|
|||||||
events.append(json.loads(line))
|
events.append(json.loads(line))
|
||||||
except json.JSONDecodeError:
|
except json.JSONDecodeError:
|
||||||
logger.debug("Skipping malformed JSONL line in %s", f)
|
logger.debug("Skipping malformed JSONL line in %s", f)
|
||||||
|
continue
|
||||||
events.sort(key=lambda e: e.get("seq", 0))
|
events.sort(key=lambda e: e.get("seq", 0))
|
||||||
return events
|
return events
|
||||||
|
|
||||||
def _read_run_events(self, thread_id: str, run_id: str) -> list[dict]:
|
def _read_run_events(self, thread_id: str, run_id: str) -> list[dict]:
|
||||||
"""Read events for a specific run file (blocking I/O)."""
|
"""Read events for a specific run file."""
|
||||||
path = self._run_file(thread_id, run_id)
|
path = self._run_file(thread_id, run_id)
|
||||||
if not path.exists():
|
if not path.exists():
|
||||||
return []
|
return []
|
||||||
@@ -122,36 +104,25 @@ class JsonlRunEventStore(RunEventStore):
|
|||||||
events.append(json.loads(line))
|
events.append(json.loads(line))
|
||||||
except json.JSONDecodeError:
|
except json.JSONDecodeError:
|
||||||
logger.debug("Skipping malformed JSONL line in %s", path)
|
logger.debug("Skipping malformed JSONL line in %s", path)
|
||||||
|
continue
|
||||||
events.sort(key=lambda e: e.get("seq", 0))
|
events.sort(key=lambda e: e.get("seq", 0))
|
||||||
return events
|
return events
|
||||||
|
|
||||||
def _delete_thread_files(self, thread_id: str) -> None:
|
|
||||||
thread_dir = self._thread_dir(thread_id)
|
|
||||||
if thread_dir.exists():
|
|
||||||
for f in thread_dir.glob("*.jsonl"):
|
|
||||||
f.unlink()
|
|
||||||
|
|
||||||
def _delete_run_file(self, thread_id: str, run_id: str) -> None:
|
|
||||||
path = self._run_file(thread_id, run_id)
|
|
||||||
if path.exists():
|
|
||||||
path.unlink()
|
|
||||||
|
|
||||||
async def put(self, *, thread_id, run_id, event_type, category, content="", metadata=None, created_at=None):
|
async def put(self, *, thread_id, run_id, event_type, category, content="", metadata=None, created_at=None):
|
||||||
async with self._get_write_lock(thread_id):
|
self._ensure_seq_loaded(thread_id)
|
||||||
await self._ensure_seq_loaded(thread_id)
|
seq = self._next_seq(thread_id)
|
||||||
seq = self._next_seq(thread_id)
|
record = {
|
||||||
record = {
|
"thread_id": thread_id,
|
||||||
"thread_id": thread_id,
|
"run_id": run_id,
|
||||||
"run_id": run_id,
|
"event_type": event_type,
|
||||||
"event_type": event_type,
|
"category": category,
|
||||||
"category": category,
|
"content": content,
|
||||||
"content": content,
|
"metadata": metadata or {},
|
||||||
"metadata": metadata or {},
|
"seq": seq,
|
||||||
"seq": seq,
|
"created_at": created_at or datetime.now(UTC).isoformat(),
|
||||||
"created_at": created_at or datetime.now(UTC).isoformat(),
|
}
|
||||||
}
|
self._write_record(record)
|
||||||
await asyncio.to_thread(self._write_record, record)
|
return record
|
||||||
return record
|
|
||||||
|
|
||||||
async def put_batch(self, events):
|
async def put_batch(self, events):
|
||||||
if not events:
|
if not events:
|
||||||
@@ -163,7 +134,7 @@ class JsonlRunEventStore(RunEventStore):
|
|||||||
return results
|
return results
|
||||||
|
|
||||||
async def list_messages(self, thread_id, *, limit=50, before_seq=None, after_seq=None):
|
async def list_messages(self, thread_id, *, limit=50, before_seq=None, after_seq=None):
|
||||||
all_events = await asyncio.to_thread(self._read_thread_events, thread_id)
|
all_events = self._read_thread_events(thread_id)
|
||||||
messages = [e for e in all_events if e.get("category") == "message"]
|
messages = [e for e in all_events if e.get("category") == "message"]
|
||||||
|
|
||||||
if before_seq is not None:
|
if before_seq is not None:
|
||||||
@@ -176,13 +147,13 @@ class JsonlRunEventStore(RunEventStore):
|
|||||||
return messages[-limit:]
|
return messages[-limit:]
|
||||||
|
|
||||||
async def list_events(self, thread_id, run_id, *, event_types=None, limit=500):
|
async def list_events(self, thread_id, run_id, *, event_types=None, limit=500):
|
||||||
events = await asyncio.to_thread(self._read_run_events, thread_id, run_id)
|
events = self._read_run_events(thread_id, run_id)
|
||||||
if event_types is not None:
|
if event_types is not None:
|
||||||
events = [e for e in events if e.get("event_type") in event_types]
|
events = [e for e in events if e.get("event_type") in event_types]
|
||||||
return events[:limit]
|
return events[:limit]
|
||||||
|
|
||||||
async def list_messages_by_run(self, thread_id, run_id, *, limit=50, before_seq=None, after_seq=None):
|
async def list_messages_by_run(self, thread_id, run_id, *, limit=50, before_seq=None, after_seq=None):
|
||||||
events = await asyncio.to_thread(self._read_run_events, thread_id, run_id)
|
events = self._read_run_events(thread_id, run_id)
|
||||||
filtered = [e for e in events if e.get("category") == "message"]
|
filtered = [e for e in events if e.get("category") == "message"]
|
||||||
if before_seq is not None:
|
if before_seq is not None:
|
||||||
filtered = [e for e in filtered if e.get("seq", 0) < before_seq]
|
filtered = [e for e in filtered if e.get("seq", 0) < before_seq]
|
||||||
@@ -194,25 +165,23 @@ class JsonlRunEventStore(RunEventStore):
|
|||||||
return filtered[-limit:] if len(filtered) > limit else filtered
|
return filtered[-limit:] if len(filtered) > limit else filtered
|
||||||
|
|
||||||
async def count_messages(self, thread_id):
|
async def count_messages(self, thread_id):
|
||||||
all_events = await asyncio.to_thread(self._read_thread_events, thread_id)
|
all_events = self._read_thread_events(thread_id)
|
||||||
return sum(1 for e in all_events if e.get("category") == "message")
|
return sum(1 for e in all_events if e.get("category") == "message")
|
||||||
|
|
||||||
async def delete_by_thread(self, thread_id):
|
async def delete_by_thread(self, thread_id):
|
||||||
async with self._get_write_lock(thread_id):
|
all_events = self._read_thread_events(thread_id)
|
||||||
all_events = await asyncio.to_thread(self._read_thread_events, thread_id)
|
count = len(all_events)
|
||||||
count = len(all_events)
|
thread_dir = self._thread_dir(thread_id)
|
||||||
await asyncio.to_thread(self._delete_thread_files, thread_id)
|
if thread_dir.exists():
|
||||||
self._seq_counters.pop(thread_id, None)
|
for f in thread_dir.glob("*.jsonl"):
|
||||||
# Pop the lock inside the held scope to minimise the window where a new caller
|
f.unlink()
|
||||||
# could obtain a fresh lock while a waiting coroutine still holds the old one.
|
self._seq_counters.pop(thread_id, None)
|
||||||
# Note: coroutines that already acquired a reference to this lock before the
|
return count
|
||||||
# delete will still proceed after we release — this is an accepted narrow race.
|
|
||||||
self._write_locks.pop(thread_id, None)
|
|
||||||
return count
|
|
||||||
|
|
||||||
async def delete_by_run(self, thread_id, run_id):
|
async def delete_by_run(self, thread_id, run_id):
|
||||||
async with self._get_write_lock(thread_id):
|
events = self._read_run_events(thread_id, run_id)
|
||||||
events = await asyncio.to_thread(self._read_run_events, thread_id, run_id)
|
count = len(events)
|
||||||
count = len(events)
|
path = self._run_file(thread_id, run_id)
|
||||||
await asyncio.to_thread(self._delete_run_file, thread_id, run_id)
|
if path.exists():
|
||||||
return count
|
path.unlink()
|
||||||
|
return count
|
||||||
|
|||||||
@@ -20,7 +20,7 @@ from __future__ import annotations
|
|||||||
import asyncio
|
import asyncio
|
||||||
import logging
|
import logging
|
||||||
import time
|
import time
|
||||||
from collections.abc import Awaitable, Callable, Mapping
|
from collections.abc import Mapping
|
||||||
from datetime import UTC, datetime
|
from datetime import UTC, datetime
|
||||||
from typing import TYPE_CHECKING, Any, cast
|
from typing import TYPE_CHECKING, Any, cast
|
||||||
from uuid import UUID
|
from uuid import UUID
|
||||||
@@ -46,8 +46,6 @@ class RunJournal(BaseCallbackHandler):
|
|||||||
*,
|
*,
|
||||||
track_token_usage: bool = True,
|
track_token_usage: bool = True,
|
||||||
flush_threshold: int = 20,
|
flush_threshold: int = 20,
|
||||||
progress_reporter: Callable[[dict], Awaitable[None]] | None = None,
|
|
||||||
progress_flush_interval: float = 5.0,
|
|
||||||
):
|
):
|
||||||
super().__init__()
|
super().__init__()
|
||||||
self.run_id = run_id
|
self.run_id = run_id
|
||||||
@@ -55,16 +53,10 @@ class RunJournal(BaseCallbackHandler):
|
|||||||
self._store = event_store
|
self._store = event_store
|
||||||
self._track_tokens = track_token_usage
|
self._track_tokens = track_token_usage
|
||||||
self._flush_threshold = flush_threshold
|
self._flush_threshold = flush_threshold
|
||||||
self._progress_reporter = progress_reporter
|
|
||||||
self._progress_flush_interval = progress_flush_interval
|
|
||||||
|
|
||||||
# Write buffer
|
# Write buffer
|
||||||
self._buffer: list[dict] = []
|
self._buffer: list[dict] = []
|
||||||
self._pending_flush_tasks: set[asyncio.Task[None]] = set()
|
self._pending_flush_tasks: set[asyncio.Task[None]] = set()
|
||||||
self._pending_progress_task: asyncio.Task[None] | None = None
|
|
||||||
self._pending_progress_delayed = False
|
|
||||||
self._progress_dirty = False
|
|
||||||
self._last_progress_flush = 0.0
|
|
||||||
|
|
||||||
# Token accumulators
|
# Token accumulators
|
||||||
self._total_input_tokens = 0
|
self._total_input_tokens = 0
|
||||||
@@ -302,8 +294,6 @@ class RunJournal(BaseCallbackHandler):
|
|||||||
else:
|
else:
|
||||||
self._lead_agent_tokens += total_tk
|
self._lead_agent_tokens += total_tk
|
||||||
|
|
||||||
self._schedule_progress_flush()
|
|
||||||
|
|
||||||
if messages:
|
if messages:
|
||||||
self._counted_message_llm_run_ids.add(str(run_id))
|
self._counted_message_llm_run_ids.add(str(run_id))
|
||||||
|
|
||||||
@@ -455,8 +445,6 @@ class RunJournal(BaseCallbackHandler):
|
|||||||
else:
|
else:
|
||||||
self._lead_agent_tokens += total_tk
|
self._lead_agent_tokens += total_tk
|
||||||
|
|
||||||
self._schedule_progress_flush()
|
|
||||||
|
|
||||||
def set_first_human_message(self, content: str) -> None:
|
def set_first_human_message(self, content: str) -> None:
|
||||||
"""Record the first human message for convenience fields."""
|
"""Record the first human message for convenience fields."""
|
||||||
self._first_human_msg = content[:2000] if content else None
|
self._first_human_msg = content[:2000] if content else None
|
||||||
@@ -486,14 +474,6 @@ class RunJournal(BaseCallbackHandler):
|
|||||||
"""Force flush remaining buffer. Called in worker's finally block."""
|
"""Force flush remaining buffer. Called in worker's finally block."""
|
||||||
if self._pending_flush_tasks:
|
if self._pending_flush_tasks:
|
||||||
await asyncio.gather(*tuple(self._pending_flush_tasks), return_exceptions=True)
|
await asyncio.gather(*tuple(self._pending_flush_tasks), return_exceptions=True)
|
||||||
while self._pending_progress_task is not None and not self._pending_progress_task.done():
|
|
||||||
if self._pending_progress_delayed:
|
|
||||||
self._pending_progress_task.cancel()
|
|
||||||
await asyncio.gather(self._pending_progress_task, return_exceptions=True)
|
|
||||||
self._progress_dirty = False
|
|
||||||
self._pending_progress_delayed = False
|
|
||||||
break
|
|
||||||
await asyncio.gather(self._pending_progress_task, return_exceptions=True)
|
|
||||||
|
|
||||||
while self._buffer:
|
while self._buffer:
|
||||||
batch = self._buffer[: self._flush_threshold]
|
batch = self._buffer[: self._flush_threshold]
|
||||||
@@ -504,57 +484,6 @@ class RunJournal(BaseCallbackHandler):
|
|||||||
self._buffer = batch + self._buffer
|
self._buffer = batch + self._buffer
|
||||||
raise
|
raise
|
||||||
|
|
||||||
def _schedule_progress_flush(self) -> None:
|
|
||||||
"""Best-effort throttled progress snapshot for active run visibility."""
|
|
||||||
if self._progress_reporter is None:
|
|
||||||
return
|
|
||||||
now = time.monotonic()
|
|
||||||
elapsed = now - self._last_progress_flush
|
|
||||||
if elapsed < self._progress_flush_interval:
|
|
||||||
self._progress_dirty = True
|
|
||||||
self._schedule_delayed_progress_flush(self._progress_flush_interval - elapsed)
|
|
||||||
return
|
|
||||||
if self._pending_progress_task is not None and not self._pending_progress_task.done():
|
|
||||||
self._progress_dirty = True
|
|
||||||
return
|
|
||||||
try:
|
|
||||||
loop = asyncio.get_running_loop()
|
|
||||||
except RuntimeError:
|
|
||||||
return
|
|
||||||
self._progress_dirty = False
|
|
||||||
self._pending_progress_task = loop.create_task(self._flush_progress_async(snapshot=self.get_completion_data()))
|
|
||||||
|
|
||||||
def _schedule_delayed_progress_flush(self, delay: float) -> None:
|
|
||||||
if self._pending_progress_task is not None and not self._pending_progress_task.done():
|
|
||||||
return
|
|
||||||
try:
|
|
||||||
loop = asyncio.get_running_loop()
|
|
||||||
except RuntimeError:
|
|
||||||
return
|
|
||||||
delay = max(0.0, delay)
|
|
||||||
self._pending_progress_delayed = delay > 0
|
|
||||||
self._pending_progress_task = loop.create_task(self._flush_progress_async(delay=delay))
|
|
||||||
|
|
||||||
async def _flush_progress_async(self, *, snapshot: dict | None = None, delay: float = 0.0) -> None:
|
|
||||||
if self._progress_reporter is None:
|
|
||||||
return
|
|
||||||
if delay > 0:
|
|
||||||
self._pending_progress_delayed = True
|
|
||||||
await asyncio.sleep(delay)
|
|
||||||
self._pending_progress_delayed = False
|
|
||||||
dirty_before_write = self._progress_dirty
|
|
||||||
self._progress_dirty = False
|
|
||||||
snapshot_to_write = snapshot or self.get_completion_data()
|
|
||||||
try:
|
|
||||||
await self._progress_reporter(snapshot_to_write)
|
|
||||||
self._last_progress_flush = time.monotonic()
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to persist progress snapshot for run %s", self.run_id, exc_info=True)
|
|
||||||
if dirty_before_write or self._progress_dirty:
|
|
||||||
self._progress_dirty = False
|
|
||||||
self._pending_progress_task = None
|
|
||||||
self._schedule_delayed_progress_flush(self._progress_flush_interval)
|
|
||||||
|
|
||||||
def get_completion_data(self) -> dict:
|
def get_completion_data(self) -> dict:
|
||||||
"""Return accumulated token and message data for run completion."""
|
"""Return accumulated token and message data for run completion."""
|
||||||
return {
|
return {
|
||||||
|
|||||||
@@ -1,39 +1,16 @@
|
|||||||
"""Run lifecycle management for LangGraph Platform API compatibility."""
|
"""Run lifecycle management for LangGraph Platform API compatibility."""
|
||||||
|
|
||||||
from .domain import (
|
|
||||||
AssistantId,
|
|
||||||
CancelAction,
|
|
||||||
DisconnectMode,
|
|
||||||
EventSeq,
|
|
||||||
InvalidRunTransition,
|
|
||||||
MultitaskStrategy,
|
|
||||||
Run,
|
|
||||||
RunId,
|
|
||||||
RunScope,
|
|
||||||
RunStatus,
|
|
||||||
ThreadId,
|
|
||||||
UserId,
|
|
||||||
)
|
|
||||||
from .manager import ConflictError, RunManager, RunRecord, UnsupportedStrategyError
|
from .manager import ConflictError, RunManager, RunRecord, UnsupportedStrategyError
|
||||||
|
from .schemas import DisconnectMode, RunStatus
|
||||||
from .worker import RunContext, run_agent
|
from .worker import RunContext, run_agent
|
||||||
|
|
||||||
__all__ = [
|
__all__ = [
|
||||||
"AssistantId",
|
|
||||||
"CancelAction",
|
|
||||||
"ConflictError",
|
"ConflictError",
|
||||||
"DisconnectMode",
|
"DisconnectMode",
|
||||||
"EventSeq",
|
|
||||||
"InvalidRunTransition",
|
|
||||||
"MultitaskStrategy",
|
|
||||||
"Run",
|
|
||||||
"RunContext",
|
"RunContext",
|
||||||
"RunId",
|
|
||||||
"RunManager",
|
"RunManager",
|
||||||
"RunRecord",
|
"RunRecord",
|
||||||
"RunScope",
|
|
||||||
"RunStatus",
|
"RunStatus",
|
||||||
"ThreadId",
|
|
||||||
"UnsupportedStrategyError",
|
"UnsupportedStrategyError",
|
||||||
"UserId",
|
|
||||||
"run_agent",
|
"run_agent",
|
||||||
]
|
]
|
||||||
|
|||||||
@@ -1,20 +0,0 @@
|
|||||||
"""Application-layer DTOs and services for run runtime use cases."""
|
|
||||||
|
|
||||||
from .commands import CancelRunCommand, CreateRunCommand, JoinRunStreamCommand
|
|
||||||
from .dto import RunMessageView, RunSnapshot, RunStreamHandle, StoredRunEvent
|
|
||||||
from .queries import GetRunQuery, ListRunMessagesQuery, ListRunsQuery
|
|
||||||
from .services import RunsApplicationService
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"CancelRunCommand",
|
|
||||||
"CreateRunCommand",
|
|
||||||
"GetRunQuery",
|
|
||||||
"JoinRunStreamCommand",
|
|
||||||
"ListRunMessagesQuery",
|
|
||||||
"ListRunsQuery",
|
|
||||||
"RunMessageView",
|
|
||||||
"RunSnapshot",
|
|
||||||
"RunStreamHandle",
|
|
||||||
"RunsApplicationService",
|
|
||||||
"StoredRunEvent",
|
|
||||||
]
|
|
||||||
@@ -1,46 +0,0 @@
|
|||||||
"""Application command DTOs for run use cases."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from dataclasses import dataclass, field
|
|
||||||
from typing import Any, Literal
|
|
||||||
|
|
||||||
from ..domain import AssistantId, CancelAction, DisconnectMode, MultitaskStrategy, RunId, RunScope, ThreadId
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class CreateRunCommand:
|
|
||||||
thread_id: ThreadId
|
|
||||||
assistant_id: AssistantId | None = None
|
|
||||||
input: dict[str, Any] | None = None
|
|
||||||
command: dict[str, Any] | None = None
|
|
||||||
metadata: dict[str, Any] = field(default_factory=dict)
|
|
||||||
config: dict[str, Any] = field(default_factory=dict)
|
|
||||||
context: dict[str, Any] = field(default_factory=dict)
|
|
||||||
scope: RunScope = RunScope.stateful
|
|
||||||
on_disconnect: DisconnectMode = DisconnectMode.cancel
|
|
||||||
multitask_strategy: MultitaskStrategy = MultitaskStrategy.reject
|
|
||||||
stream_mode: list[str] | str | None = None
|
|
||||||
stream_subgraphs: bool = False
|
|
||||||
interrupt_before: list[str] | Literal["*"] | None = None
|
|
||||||
interrupt_after: list[str] | Literal["*"] | None = None
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class CancelRunCommand:
|
|
||||||
run_id: RunId
|
|
||||||
action: CancelAction = CancelAction.interrupt
|
|
||||||
wait: bool = False
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class JoinRunStreamCommand:
|
|
||||||
run_id: RunId
|
|
||||||
last_event_id: str | None = None
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"CancelRunCommand",
|
|
||||||
"CreateRunCommand",
|
|
||||||
"JoinRunStreamCommand",
|
|
||||||
]
|
|
||||||
@@ -1,76 +0,0 @@
|
|||||||
"""Application output DTOs for run use cases."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from collections.abc import AsyncIterator
|
|
||||||
from dataclasses import dataclass, field
|
|
||||||
from typing import Any
|
|
||||||
|
|
||||||
from ..domain import AssistantId, EventSeq, Run, RunId, RunStatus, ThreadId
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class RunSnapshot:
|
|
||||||
run_id: RunId
|
|
||||||
thread_id: ThreadId
|
|
||||||
assistant_id: AssistantId | None = None
|
|
||||||
status: RunStatus = RunStatus.pending
|
|
||||||
metadata: dict[str, Any] = field(default_factory=dict)
|
|
||||||
kwargs: dict[str, Any] = field(default_factory=dict)
|
|
||||||
created_at: str = ""
|
|
||||||
updated_at: str = ""
|
|
||||||
error: str | None = None
|
|
||||||
model_name: str | None = None
|
|
||||||
|
|
||||||
@classmethod
|
|
||||||
def from_run(cls, run: Run) -> RunSnapshot:
|
|
||||||
return cls(
|
|
||||||
run_id=run.run_id,
|
|
||||||
thread_id=run.thread_id,
|
|
||||||
assistant_id=run.assistant_id,
|
|
||||||
status=run.status,
|
|
||||||
metadata=dict(run.metadata),
|
|
||||||
kwargs=dict(run.kwargs),
|
|
||||||
created_at=run.created_at,
|
|
||||||
updated_at=run.updated_at,
|
|
||||||
error=run.error,
|
|
||||||
model_name=run.model_name,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class RunMessageView:
|
|
||||||
thread_id: ThreadId
|
|
||||||
run_id: RunId
|
|
||||||
seq: EventSeq
|
|
||||||
event_type: str
|
|
||||||
content: str | dict[str, Any] = ""
|
|
||||||
metadata: dict[str, Any] = field(default_factory=dict)
|
|
||||||
created_at: str = ""
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class StoredRunEvent:
|
|
||||||
thread_id: ThreadId
|
|
||||||
run_id: RunId
|
|
||||||
seq: EventSeq
|
|
||||||
event_type: str
|
|
||||||
category: str
|
|
||||||
content: str | dict[str, Any] = ""
|
|
||||||
metadata: dict[str, Any] = field(default_factory=dict)
|
|
||||||
created_at: str = ""
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class RunStreamHandle:
|
|
||||||
run_id: RunId
|
|
||||||
thread_id: ThreadId
|
|
||||||
events: AsyncIterator[Any]
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"RunMessageView",
|
|
||||||
"RunSnapshot",
|
|
||||||
"RunStreamHandle",
|
|
||||||
"StoredRunEvent",
|
|
||||||
]
|
|
||||||
@@ -1,37 +0,0 @@
|
|||||||
"""Application query DTOs for run use cases."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from dataclasses import dataclass
|
|
||||||
|
|
||||||
from ..domain import RunId, ThreadId, UserId
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class GetRunQuery:
|
|
||||||
run_id: RunId
|
|
||||||
thread_id: ThreadId | None = None
|
|
||||||
user_id: UserId | None = None
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class ListRunsQuery:
|
|
||||||
thread_id: ThreadId
|
|
||||||
user_id: UserId | None = None
|
|
||||||
limit: int = 100
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class ListRunMessagesQuery:
|
|
||||||
thread_id: ThreadId
|
|
||||||
run_id: RunId
|
|
||||||
limit: int = 50
|
|
||||||
before_seq: int | None = None
|
|
||||||
after_seq: int | None = None
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"GetRunQuery",
|
|
||||||
"ListRunMessagesQuery",
|
|
||||||
"ListRunsQuery",
|
|
||||||
]
|
|
||||||
@@ -1,74 +0,0 @@
|
|||||||
"""Application service skeleton for run use cases."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from dataclasses import dataclass
|
|
||||||
|
|
||||||
from ..execution import RunExecutionScheduler, RunSupervisor
|
|
||||||
from ..repositories import RunEventLog, RunRepository
|
|
||||||
from ..streams import RunStreamBroker
|
|
||||||
from .commands import CancelRunCommand, CreateRunCommand, JoinRunStreamCommand
|
|
||||||
from .dto import RunMessageView, RunSnapshot, RunStreamHandle
|
|
||||||
from .queries import GetRunQuery, ListRunMessagesQuery, ListRunsQuery
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass
|
|
||||||
class RunsApplicationService:
|
|
||||||
"""Use-case orchestration boundary for run runtime operations.
|
|
||||||
|
|
||||||
PR1 only introduces the boundary and dependency shape. Existing Gateway
|
|
||||||
handlers continue to call the legacy service functions until later PRs move
|
|
||||||
behavior into this class.
|
|
||||||
"""
|
|
||||||
|
|
||||||
run_repository: RunRepository
|
|
||||||
run_event_log: RunEventLog
|
|
||||||
stream_broker: RunStreamBroker
|
|
||||||
scheduler: RunExecutionScheduler
|
|
||||||
supervisor: RunSupervisor
|
|
||||||
|
|
||||||
async def create_background(self, command: CreateRunCommand) -> RunSnapshot:
|
|
||||||
# PR1 defines the application boundary; later PRs move Gateway runtime
|
|
||||||
# behavior behind this method.
|
|
||||||
raise NotImplementedError("RunsApplicationService is not wired in PR1")
|
|
||||||
|
|
||||||
async def create_and_stream(self, command: CreateRunCommand) -> RunStreamHandle:
|
|
||||||
raise NotImplementedError("RunsApplicationService is not wired in PR1")
|
|
||||||
|
|
||||||
async def create_and_wait(self, command: CreateRunCommand) -> RunSnapshot:
|
|
||||||
raise NotImplementedError("RunsApplicationService is not wired in PR1")
|
|
||||||
|
|
||||||
async def join_stream(self, command: JoinRunStreamCommand) -> RunStreamHandle:
|
|
||||||
raise NotImplementedError("RunsApplicationService is not wired in PR1")
|
|
||||||
|
|
||||||
async def cancel(self, command: CancelRunCommand) -> bool:
|
|
||||||
return await self.supervisor.cancel(command.run_id, action=command.action)
|
|
||||||
|
|
||||||
async def get_run(self, query: GetRunQuery) -> RunSnapshot | None:
|
|
||||||
run = await self.run_repository.get(query.run_id, user_id=query.user_id)
|
|
||||||
if run is None:
|
|
||||||
return None
|
|
||||||
if query.thread_id is not None and run.thread_id != query.thread_id:
|
|
||||||
return None
|
|
||||||
return RunSnapshot.from_run(run)
|
|
||||||
|
|
||||||
async def list_runs(self, query: ListRunsQuery) -> list[RunSnapshot]:
|
|
||||||
return await self.run_repository.list_by_thread(
|
|
||||||
query.thread_id,
|
|
||||||
user_id=query.user_id,
|
|
||||||
limit=query.limit,
|
|
||||||
)
|
|
||||||
|
|
||||||
async def list_run_messages(self, query: ListRunMessagesQuery) -> list[RunMessageView]:
|
|
||||||
return await self.run_event_log.list_messages_by_run(
|
|
||||||
query.thread_id,
|
|
||||||
query.run_id,
|
|
||||||
limit=query.limit,
|
|
||||||
before_seq=query.before_seq,
|
|
||||||
after_seq=query.after_seq,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"RunsApplicationService",
|
|
||||||
]
|
|
||||||
@@ -1,33 +0,0 @@
|
|||||||
"""Run runtime domain model."""
|
|
||||||
|
|
||||||
from .errors import InvalidRunTransition, RunDomainError
|
|
||||||
from .events import RunCancelled, RunCompleted, RunCreated, RunEvent, RunFailed, RunStarted
|
|
||||||
from .identifiers import AssistantId, RunId, ThreadId, UserId
|
|
||||||
from .model import Run
|
|
||||||
from .policies import CancelPolicy, MultitaskDecision, MultitaskPolicy
|
|
||||||
from .value_objects import CancelAction, DisconnectMode, EventSeq, MultitaskStrategy, RunScope, RunStatus
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"AssistantId",
|
|
||||||
"CancelAction",
|
|
||||||
"CancelPolicy",
|
|
||||||
"DisconnectMode",
|
|
||||||
"EventSeq",
|
|
||||||
"InvalidRunTransition",
|
|
||||||
"MultitaskDecision",
|
|
||||||
"MultitaskPolicy",
|
|
||||||
"MultitaskStrategy",
|
|
||||||
"Run",
|
|
||||||
"RunCancelled",
|
|
||||||
"RunCompleted",
|
|
||||||
"RunCreated",
|
|
||||||
"RunDomainError",
|
|
||||||
"RunEvent",
|
|
||||||
"RunFailed",
|
|
||||||
"RunId",
|
|
||||||
"RunScope",
|
|
||||||
"RunStarted",
|
|
||||||
"RunStatus",
|
|
||||||
"ThreadId",
|
|
||||||
"UserId",
|
|
||||||
]
|
|
||||||
@@ -1,24 +0,0 @@
|
|||||||
"""Domain-level errors for run lifecycle operations."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from .value_objects import RunStatus
|
|
||||||
|
|
||||||
|
|
||||||
class RunDomainError(Exception):
|
|
||||||
"""Base class for run runtime domain errors."""
|
|
||||||
|
|
||||||
|
|
||||||
class InvalidRunTransition(RunDomainError):
|
|
||||||
"""Raised when a run status transition violates lifecycle rules."""
|
|
||||||
|
|
||||||
def __init__(self, current: RunStatus, target: RunStatus) -> None:
|
|
||||||
super().__init__(f"Cannot transition run from {current.value!r} to {target.value!r}")
|
|
||||||
self.current = current
|
|
||||||
self.target = target
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"InvalidRunTransition",
|
|
||||||
"RunDomainError",
|
|
||||||
]
|
|
||||||
@@ -1,64 +0,0 @@
|
|||||||
"""Domain events emitted by the run aggregate."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from dataclasses import dataclass, field
|
|
||||||
from typing import Any
|
|
||||||
|
|
||||||
from deerflow.utils.time import now_iso
|
|
||||||
|
|
||||||
from .identifiers import AssistantId, RunId, ThreadId
|
|
||||||
from .value_objects import CancelAction, RunStatus
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class RunCreated:
|
|
||||||
run_id: RunId
|
|
||||||
thread_id: ThreadId
|
|
||||||
occurred_at: str = field(default_factory=now_iso)
|
|
||||||
assistant_id: AssistantId | None = None
|
|
||||||
metadata: dict[str, Any] = field(default_factory=dict)
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class RunStarted:
|
|
||||||
run_id: RunId
|
|
||||||
thread_id: ThreadId
|
|
||||||
occurred_at: str = field(default_factory=now_iso)
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class RunCompleted:
|
|
||||||
run_id: RunId
|
|
||||||
thread_id: ThreadId
|
|
||||||
occurred_at: str = field(default_factory=now_iso)
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class RunFailed:
|
|
||||||
run_id: RunId
|
|
||||||
thread_id: ThreadId
|
|
||||||
status: RunStatus
|
|
||||||
occurred_at: str = field(default_factory=now_iso)
|
|
||||||
error: str | None = None
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class RunCancelled:
|
|
||||||
run_id: RunId
|
|
||||||
thread_id: ThreadId
|
|
||||||
occurred_at: str = field(default_factory=now_iso)
|
|
||||||
action: CancelAction = CancelAction.interrupt
|
|
||||||
|
|
||||||
|
|
||||||
RunEvent = RunCreated | RunStarted | RunCompleted | RunFailed | RunCancelled
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"RunCancelled",
|
|
||||||
"RunCompleted",
|
|
||||||
"RunCreated",
|
|
||||||
"RunEvent",
|
|
||||||
"RunFailed",
|
|
||||||
"RunStarted",
|
|
||||||
]
|
|
||||||
@@ -1,27 +0,0 @@
|
|||||||
"""Lightweight identifiers for the run runtime domain."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from typing import NewType
|
|
||||||
|
|
||||||
RunId = NewType("RunId", str)
|
|
||||||
ThreadId = NewType("ThreadId", str)
|
|
||||||
AssistantId = NewType("AssistantId", str)
|
|
||||||
UserId = NewType("UserId", str)
|
|
||||||
|
|
||||||
|
|
||||||
def require_non_empty(value: str, *, field_name: str) -> str:
|
|
||||||
"""Return a stripped identifier value, rejecting empty identifiers."""
|
|
||||||
normalized = value.strip()
|
|
||||||
if not normalized:
|
|
||||||
raise ValueError(f"{field_name} must not be empty")
|
|
||||||
return normalized
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"AssistantId",
|
|
||||||
"RunId",
|
|
||||||
"ThreadId",
|
|
||||||
"UserId",
|
|
||||||
"require_non_empty",
|
|
||||||
]
|
|
||||||
@@ -1,193 +0,0 @@
|
|||||||
"""Run aggregate root and lifecycle invariants."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from dataclasses import dataclass, field
|
|
||||||
from typing import Any
|
|
||||||
|
|
||||||
from deerflow.utils.time import now_iso
|
|
||||||
|
|
||||||
from .errors import InvalidRunTransition
|
|
||||||
from .events import RunCancelled, RunCompleted, RunCreated, RunEvent, RunFailed, RunStarted
|
|
||||||
from .identifiers import AssistantId, RunId, ThreadId, require_non_empty
|
|
||||||
from .value_objects import CancelAction, MultitaskStrategy, RunScope, RunStatus
|
|
||||||
|
|
||||||
# Keep lifecycle transitions explicit so later application code cannot invent
|
|
||||||
# ad hoc status moves outside the aggregate.
|
|
||||||
_ALLOWED_TRANSITIONS: dict[RunStatus, frozenset[RunStatus]] = {
|
|
||||||
RunStatus.pending: frozenset(
|
|
||||||
{
|
|
||||||
RunStatus.running,
|
|
||||||
RunStatus.error,
|
|
||||||
RunStatus.timeout,
|
|
||||||
RunStatus.interrupted,
|
|
||||||
}
|
|
||||||
),
|
|
||||||
RunStatus.running: frozenset(
|
|
||||||
{
|
|
||||||
RunStatus.success,
|
|
||||||
RunStatus.error,
|
|
||||||
RunStatus.timeout,
|
|
||||||
RunStatus.interrupted,
|
|
||||||
}
|
|
||||||
),
|
|
||||||
RunStatus.success: frozenset(),
|
|
||||||
RunStatus.error: frozenset(),
|
|
||||||
RunStatus.timeout: frozenset(),
|
|
||||||
RunStatus.interrupted: frozenset(),
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass
|
|
||||||
class Run:
|
|
||||||
"""Run aggregate root.
|
|
||||||
|
|
||||||
The aggregate owns lifecycle invariants only. Infrastructure concerns such
|
|
||||||
as SQL sessions, SSE frames, Redis clients, and FastAPI requests stay out of
|
|
||||||
this model.
|
|
||||||
"""
|
|
||||||
|
|
||||||
run_id: RunId
|
|
||||||
thread_id: ThreadId
|
|
||||||
status: RunStatus
|
|
||||||
assistant_id: AssistantId | None = None
|
|
||||||
scope: RunScope = RunScope.stateful
|
|
||||||
multitask_strategy: MultitaskStrategy = MultitaskStrategy.reject
|
|
||||||
metadata: dict[str, Any] = field(default_factory=dict)
|
|
||||||
kwargs: dict[str, Any] = field(default_factory=dict)
|
|
||||||
created_at: str = field(default_factory=now_iso)
|
|
||||||
updated_at: str = field(default_factory=now_iso)
|
|
||||||
error: str | None = None
|
|
||||||
model_name: str | None = None
|
|
||||||
_pending_events: list[RunEvent] = field(default_factory=list, init=False, repr=False)
|
|
||||||
|
|
||||||
def __post_init__(self) -> None:
|
|
||||||
self.run_id = RunId(require_non_empty(str(self.run_id), field_name="run_id"))
|
|
||||||
self.thread_id = ThreadId(require_non_empty(str(self.thread_id), field_name="thread_id"))
|
|
||||||
if self.assistant_id is not None:
|
|
||||||
self.assistant_id = AssistantId(require_non_empty(str(self.assistant_id), field_name="assistant_id"))
|
|
||||||
|
|
||||||
@classmethod
|
|
||||||
def create(
|
|
||||||
cls,
|
|
||||||
*,
|
|
||||||
run_id: RunId,
|
|
||||||
thread_id: ThreadId,
|
|
||||||
assistant_id: AssistantId | None = None,
|
|
||||||
scope: RunScope = RunScope.stateful,
|
|
||||||
multitask_strategy: MultitaskStrategy = MultitaskStrategy.reject,
|
|
||||||
metadata: dict[str, Any] | None = None,
|
|
||||||
kwargs: dict[str, Any] | None = None,
|
|
||||||
model_name: str | None = None,
|
|
||||||
created_at: str | None = None,
|
|
||||||
) -> Run:
|
|
||||||
timestamp = created_at or now_iso()
|
|
||||||
run = cls(
|
|
||||||
run_id=run_id,
|
|
||||||
thread_id=thread_id,
|
|
||||||
assistant_id=assistant_id,
|
|
||||||
status=RunStatus.pending,
|
|
||||||
scope=scope,
|
|
||||||
multitask_strategy=multitask_strategy,
|
|
||||||
metadata=metadata or {},
|
|
||||||
kwargs=kwargs or {},
|
|
||||||
created_at=timestamp,
|
|
||||||
updated_at=timestamp,
|
|
||||||
model_name=model_name,
|
|
||||||
)
|
|
||||||
run._record_event(
|
|
||||||
RunCreated(
|
|
||||||
run_id=run.run_id,
|
|
||||||
thread_id=run.thread_id,
|
|
||||||
occurred_at=timestamp,
|
|
||||||
assistant_id=run.assistant_id,
|
|
||||||
metadata=dict(run.metadata),
|
|
||||||
)
|
|
||||||
)
|
|
||||||
return run
|
|
||||||
|
|
||||||
@property
|
|
||||||
def is_terminal(self) -> bool:
|
|
||||||
return not _ALLOWED_TRANSITIONS[self.status]
|
|
||||||
|
|
||||||
def pull_events(self) -> tuple[RunEvent, ...]:
|
|
||||||
# Domain events are drained by the application layer after the aggregate
|
|
||||||
# has accepted a state change.
|
|
||||||
events = tuple(self._pending_events)
|
|
||||||
self._pending_events.clear()
|
|
||||||
return events
|
|
||||||
|
|
||||||
def mark_started(self, *, at: str | None = None) -> None:
|
|
||||||
self._transition_to(RunStatus.running, at=at)
|
|
||||||
|
|
||||||
def mark_completed(self, *, at: str | None = None) -> None:
|
|
||||||
self._transition_to(RunStatus.success, at=at)
|
|
||||||
|
|
||||||
def mark_failed(self, error: str | None = None, *, at: str | None = None) -> None:
|
|
||||||
self._transition_to(RunStatus.error, error=error, at=at)
|
|
||||||
|
|
||||||
def mark_timed_out(self, error: str | None = None, *, at: str | None = None) -> None:
|
|
||||||
self._transition_to(RunStatus.timeout, error=error, at=at)
|
|
||||||
|
|
||||||
def mark_cancelled(self, *, action: CancelAction = CancelAction.interrupt, at: str | None = None) -> None:
|
|
||||||
self._transition_to(RunStatus.interrupted, action=action, at=at)
|
|
||||||
|
|
||||||
def _transition_to(
|
|
||||||
self,
|
|
||||||
target: RunStatus,
|
|
||||||
*,
|
|
||||||
error: str | None = None,
|
|
||||||
action: CancelAction = CancelAction.interrupt,
|
|
||||||
at: str | None = None,
|
|
||||||
) -> None:
|
|
||||||
if target == self.status:
|
|
||||||
return
|
|
||||||
if target not in _ALLOWED_TRANSITIONS[self.status]:
|
|
||||||
raise InvalidRunTransition(self.status, target)
|
|
||||||
|
|
||||||
timestamp = at or now_iso()
|
|
||||||
self.status = target
|
|
||||||
self.updated_at = timestamp
|
|
||||||
if error is not None:
|
|
||||||
self.error = error
|
|
||||||
self._record_event(self._event_for_transition(target, timestamp, error=error, action=action))
|
|
||||||
|
|
||||||
def _event_for_transition(
|
|
||||||
self,
|
|
||||||
target: RunStatus,
|
|
||||||
occurred_at: str,
|
|
||||||
*,
|
|
||||||
error: str | None,
|
|
||||||
action: CancelAction,
|
|
||||||
) -> RunEvent:
|
|
||||||
# Keep event construction next to the transition rules so a new status
|
|
||||||
# cannot be added without an explicit durable event shape.
|
|
||||||
if target == RunStatus.running:
|
|
||||||
return RunStarted(run_id=self.run_id, thread_id=self.thread_id, occurred_at=occurred_at)
|
|
||||||
if target == RunStatus.success:
|
|
||||||
return RunCompleted(run_id=self.run_id, thread_id=self.thread_id, occurred_at=occurred_at)
|
|
||||||
if target in (RunStatus.error, RunStatus.timeout):
|
|
||||||
return RunFailed(
|
|
||||||
run_id=self.run_id,
|
|
||||||
thread_id=self.thread_id,
|
|
||||||
status=target,
|
|
||||||
occurred_at=occurred_at,
|
|
||||||
error=error,
|
|
||||||
)
|
|
||||||
if target == RunStatus.interrupted:
|
|
||||||
return RunCancelled(
|
|
||||||
run_id=self.run_id,
|
|
||||||
thread_id=self.thread_id,
|
|
||||||
occurred_at=occurred_at,
|
|
||||||
action=action,
|
|
||||||
)
|
|
||||||
raise InvalidRunTransition(self.status, target)
|
|
||||||
|
|
||||||
def _record_event(self, event: RunEvent) -> None:
|
|
||||||
self._pending_events.append(event)
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"Run",
|
|
||||||
"RunStatus",
|
|
||||||
]
|
|
||||||
@@ -1,50 +0,0 @@
|
|||||||
"""Domain policies for run concurrency and cancellation."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from collections.abc import Sequence
|
|
||||||
from dataclasses import dataclass
|
|
||||||
from enum import StrEnum
|
|
||||||
|
|
||||||
from .model import Run
|
|
||||||
from .value_objects import CancelAction, MultitaskStrategy, RunStatus
|
|
||||||
|
|
||||||
|
|
||||||
class MultitaskDecision(StrEnum):
|
|
||||||
"""Application-level decision produced by a multitask policy."""
|
|
||||||
|
|
||||||
allow = "allow"
|
|
||||||
reject = "reject"
|
|
||||||
cancel_existing = "cancel_existing"
|
|
||||||
enqueue = "enqueue"
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class MultitaskPolicy:
|
|
||||||
strategy: MultitaskStrategy = MultitaskStrategy.reject
|
|
||||||
|
|
||||||
def decide(self, active_runs: Sequence[Run]) -> MultitaskDecision:
|
|
||||||
inflight = [run for run in active_runs if run.status in (RunStatus.pending, RunStatus.running)]
|
|
||||||
if not inflight:
|
|
||||||
return MultitaskDecision.allow
|
|
||||||
if self.strategy == MultitaskStrategy.reject:
|
|
||||||
return MultitaskDecision.reject
|
|
||||||
if self.strategy in (MultitaskStrategy.interrupt, MultitaskStrategy.rollback):
|
|
||||||
return MultitaskDecision.cancel_existing
|
|
||||||
return MultitaskDecision.enqueue
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class CancelPolicy:
|
|
||||||
action: CancelAction = CancelAction.interrupt
|
|
||||||
|
|
||||||
@property
|
|
||||||
def rolls_back_checkpoint(self) -> bool:
|
|
||||||
return self.action == CancelAction.rollback
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"CancelPolicy",
|
|
||||||
"MultitaskDecision",
|
|
||||||
"MultitaskPolicy",
|
|
||||||
]
|
|
||||||
@@ -1,88 +0,0 @@
|
|||||||
"""Domain value objects for run lifecycle semantics."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from dataclasses import dataclass
|
|
||||||
from enum import StrEnum
|
|
||||||
|
|
||||||
|
|
||||||
class RunStatus(StrEnum):
|
|
||||||
"""Lifecycle status of a single run."""
|
|
||||||
|
|
||||||
pending = "pending"
|
|
||||||
running = "running"
|
|
||||||
success = "success"
|
|
||||||
error = "error"
|
|
||||||
timeout = "timeout"
|
|
||||||
interrupted = "interrupted"
|
|
||||||
|
|
||||||
|
|
||||||
class DisconnectMode(StrEnum):
|
|
||||||
"""Behaviour when the SSE consumer disconnects."""
|
|
||||||
|
|
||||||
cancel = "cancel"
|
|
||||||
continue_ = "continue"
|
|
||||||
|
|
||||||
|
|
||||||
class RunScope(StrEnum):
|
|
||||||
"""Conversation scope for a run."""
|
|
||||||
|
|
||||||
stateful = "stateful"
|
|
||||||
stateless = "stateless"
|
|
||||||
temporary_thread = "temporary_thread"
|
|
||||||
|
|
||||||
|
|
||||||
class MultitaskStrategy(StrEnum):
|
|
||||||
"""Concurrency strategy for a new run on a thread."""
|
|
||||||
|
|
||||||
reject = "reject"
|
|
||||||
interrupt = "interrupt"
|
|
||||||
rollback = "rollback"
|
|
||||||
enqueue = "enqueue"
|
|
||||||
|
|
||||||
|
|
||||||
class CancelAction(StrEnum):
|
|
||||||
"""Cancellation action requested by an API or supervisor."""
|
|
||||||
|
|
||||||
interrupt = "interrupt"
|
|
||||||
rollback = "rollback"
|
|
||||||
|
|
||||||
|
|
||||||
TERMINAL_RUN_STATUSES: frozenset[RunStatus] = frozenset(
|
|
||||||
{
|
|
||||||
RunStatus.success,
|
|
||||||
RunStatus.error,
|
|
||||||
RunStatus.timeout,
|
|
||||||
RunStatus.interrupted,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def is_terminal_status(status: RunStatus) -> bool:
|
|
||||||
return status in TERMINAL_RUN_STATUSES
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True, order=True)
|
|
||||||
class EventSeq:
|
|
||||||
"""Thread-local event sequence number."""
|
|
||||||
|
|
||||||
value: int
|
|
||||||
|
|
||||||
def __post_init__(self) -> None:
|
|
||||||
if self.value < 0:
|
|
||||||
raise ValueError("EventSeq must be non-negative")
|
|
||||||
|
|
||||||
def next(self) -> EventSeq:
|
|
||||||
return EventSeq(self.value + 1)
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"CancelAction",
|
|
||||||
"DisconnectMode",
|
|
||||||
"EventSeq",
|
|
||||||
"MultitaskStrategy",
|
|
||||||
"RunScope",
|
|
||||||
"RunStatus",
|
|
||||||
"TERMINAL_RUN_STATUSES",
|
|
||||||
"is_terminal_status",
|
|
||||||
]
|
|
||||||
@@ -1,12 +0,0 @@
|
|||||||
"""Execution contracts for run lifecycle orchestration."""
|
|
||||||
|
|
||||||
from .executor import RunExecutor
|
|
||||||
from .scheduler import RunExecutionHandle, RunExecutionScheduler
|
|
||||||
from .supervisor import RunSupervisor
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"RunExecutionHandle",
|
|
||||||
"RunExecutionScheduler",
|
|
||||||
"RunExecutor",
|
|
||||||
"RunSupervisor",
|
|
||||||
]
|
|
||||||
@@ -1,19 +0,0 @@
|
|||||||
"""Run executor contract."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from typing import Protocol
|
|
||||||
|
|
||||||
from ..domain import Run
|
|
||||||
|
|
||||||
|
|
||||||
class RunExecutor(Protocol):
|
|
||||||
"""Executes one run against the underlying agent or graph runtime."""
|
|
||||||
|
|
||||||
async def execute(self, run: Run) -> None:
|
|
||||||
pass
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"RunExecutor",
|
|
||||||
]
|
|
||||||
@@ -1,26 +0,0 @@
|
|||||||
"""Run execution scheduler contract."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from dataclasses import dataclass
|
|
||||||
from typing import Protocol
|
|
||||||
|
|
||||||
from ..domain import RunId
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class RunExecutionHandle:
|
|
||||||
run_id: RunId
|
|
||||||
|
|
||||||
|
|
||||||
class RunExecutionScheduler(Protocol):
|
|
||||||
"""Starts background execution for an accepted run."""
|
|
||||||
|
|
||||||
async def start(self, run_id: RunId) -> RunExecutionHandle:
|
|
||||||
pass
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"RunExecutionHandle",
|
|
||||||
"RunExecutionScheduler",
|
|
||||||
]
|
|
||||||
@@ -1,19 +0,0 @@
|
|||||||
"""Run execution supervision contract."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from typing import Protocol
|
|
||||||
|
|
||||||
from ..domain import CancelAction, RunId
|
|
||||||
|
|
||||||
|
|
||||||
class RunSupervisor(Protocol):
|
|
||||||
"""Controls lifecycle operations for already scheduled runs."""
|
|
||||||
|
|
||||||
async def cancel(self, run_id: RunId, *, action: CancelAction = CancelAction.interrupt) -> bool:
|
|
||||||
pass
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"RunSupervisor",
|
|
||||||
]
|
|
||||||
@@ -4,11 +4,9 @@ from __future__ import annotations
|
|||||||
|
|
||||||
import asyncio
|
import asyncio
|
||||||
import logging
|
import logging
|
||||||
import sqlite3
|
|
||||||
import uuid
|
import uuid
|
||||||
from collections.abc import Awaitable, Callable
|
|
||||||
from dataclasses import dataclass, field
|
from dataclasses import dataclass, field
|
||||||
from typing import TYPE_CHECKING, Any
|
from typing import TYPE_CHECKING
|
||||||
|
|
||||||
from deerflow.utils.time import now_iso as _now_iso
|
from deerflow.utils.time import now_iso as _now_iso
|
||||||
|
|
||||||
@@ -19,57 +17,6 @@ if TYPE_CHECKING:
|
|||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
_RETRYABLE_SQLITE_MESSAGES = (
|
|
||||||
"database is locked",
|
|
||||||
"database table is locked",
|
|
||||||
"database is busy",
|
|
||||||
)
|
|
||||||
|
|
||||||
_RETRYABLE_SQLITE_ERROR_CODES = {
|
|
||||||
sqlite3.SQLITE_BUSY,
|
|
||||||
sqlite3.SQLITE_LOCKED,
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
def _is_retryable_persistence_error(exc: BaseException) -> bool:
|
|
||||||
"""Return True for transient SQLite persistence failures.
|
|
||||||
|
|
||||||
SQLite lock contention normally surfaces through either sqlite3 exceptions
|
|
||||||
or SQLAlchemy wrappers. The short bounded retry here protects run status
|
|
||||||
finalization from transient writer pressure without hiding permanent
|
|
||||||
failures forever.
|
|
||||||
"""
|
|
||||||
|
|
||||||
pending: list[BaseException] = [exc]
|
|
||||||
seen: set[int] = set()
|
|
||||||
while pending:
|
|
||||||
current = pending.pop()
|
|
||||||
if id(current) in seen:
|
|
||||||
continue
|
|
||||||
seen.add(id(current))
|
|
||||||
|
|
||||||
message = str(current).lower()
|
|
||||||
if any(fragment in message for fragment in _RETRYABLE_SQLITE_MESSAGES):
|
|
||||||
return True
|
|
||||||
if isinstance(current, (sqlite3.OperationalError, sqlite3.DatabaseError)):
|
|
||||||
error_code = getattr(current, "sqlite_errorcode", None)
|
|
||||||
if error_code in _RETRYABLE_SQLITE_ERROR_CODES:
|
|
||||||
return True
|
|
||||||
for chained in (getattr(current, "orig", None), current.__cause__, current.__context__):
|
|
||||||
if isinstance(chained, BaseException):
|
|
||||||
pending.append(chained)
|
|
||||||
return False
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class PersistenceRetryPolicy:
|
|
||||||
"""Bounded retry policy for short run-store writes."""
|
|
||||||
|
|
||||||
max_attempts: int = 5
|
|
||||||
initial_delay: float = 0.05
|
|
||||||
max_delay: float = 1.0
|
|
||||||
backoff_factor: float = 2.0
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass
|
@dataclass
|
||||||
class RunRecord:
|
class RunRecord:
|
||||||
@@ -90,17 +37,6 @@ class RunRecord:
|
|||||||
abort_action: str = "interrupt"
|
abort_action: str = "interrupt"
|
||||||
error: str | None = None
|
error: str | None = None
|
||||||
model_name: str | None = None
|
model_name: str | None = None
|
||||||
store_only: bool = False
|
|
||||||
total_input_tokens: int = 0
|
|
||||||
total_output_tokens: int = 0
|
|
||||||
total_tokens: int = 0
|
|
||||||
llm_call_count: int = 0
|
|
||||||
lead_agent_tokens: int = 0
|
|
||||||
subagent_tokens: int = 0
|
|
||||||
middleware_tokens: int = 0
|
|
||||||
message_count: int = 0
|
|
||||||
last_ai_message: str | None = None
|
|
||||||
first_human_message: str | None = None
|
|
||||||
|
|
||||||
|
|
||||||
class RunManager:
|
class RunManager:
|
||||||
@@ -111,205 +47,37 @@ class RunManager:
|
|||||||
that run history survives process restarts.
|
that run history survives process restarts.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
def __init__(
|
def __init__(self, store: RunStore | None = None) -> None:
|
||||||
self,
|
|
||||||
store: RunStore | None = None,
|
|
||||||
*,
|
|
||||||
persistence_retry_policy: PersistenceRetryPolicy | None = None,
|
|
||||||
) -> None:
|
|
||||||
self._runs: dict[str, RunRecord] = {}
|
self._runs: dict[str, RunRecord] = {}
|
||||||
self._lock = asyncio.Lock()
|
self._lock = asyncio.Lock()
|
||||||
self._store = store
|
self._store = store
|
||||||
self._persistence_retry_policy = persistence_retry_policy or PersistenceRetryPolicy()
|
|
||||||
|
|
||||||
@staticmethod
|
async def _persist_to_store(self, record: RunRecord) -> None:
|
||||||
def _store_put_payload(record: RunRecord, *, error: str | None = None) -> dict[str, Any]:
|
"""Best-effort persist run record to backing store."""
|
||||||
return {
|
|
||||||
"thread_id": record.thread_id,
|
|
||||||
"assistant_id": record.assistant_id,
|
|
||||||
"status": record.status.value,
|
|
||||||
"multitask_strategy": record.multitask_strategy,
|
|
||||||
"metadata": record.metadata or {},
|
|
||||||
"kwargs": record.kwargs or {},
|
|
||||||
"error": error if error is not None else record.error,
|
|
||||||
"created_at": record.created_at,
|
|
||||||
"model_name": record.model_name,
|
|
||||||
}
|
|
||||||
|
|
||||||
async def _call_store_with_retry(
|
|
||||||
self,
|
|
||||||
operation_name: str,
|
|
||||||
run_id: str,
|
|
||||||
operation: Callable[[], Awaitable[Any]],
|
|
||||||
) -> Any:
|
|
||||||
"""Run a short store operation with bounded retries for SQLite pressure."""
|
|
||||||
policy = self._persistence_retry_policy
|
|
||||||
attempt = 1
|
|
||||||
delay = policy.initial_delay
|
|
||||||
while True:
|
|
||||||
try:
|
|
||||||
return await operation()
|
|
||||||
except Exception as exc:
|
|
||||||
retryable = _is_retryable_persistence_error(exc)
|
|
||||||
if attempt >= policy.max_attempts or not retryable:
|
|
||||||
raise
|
|
||||||
logger.warning(
|
|
||||||
"Transient persistence failure during %s for run %s (attempt %d/%d); retrying",
|
|
||||||
operation_name,
|
|
||||||
run_id,
|
|
||||||
attempt,
|
|
||||||
policy.max_attempts,
|
|
||||||
exc_info=True,
|
|
||||||
)
|
|
||||||
if delay > 0:
|
|
||||||
await asyncio.sleep(delay)
|
|
||||||
delay = min(policy.max_delay, delay * policy.backoff_factor if delay else policy.initial_delay)
|
|
||||||
attempt += 1
|
|
||||||
|
|
||||||
async def _persist_snapshot_to_store(self, run_id: str, payload: dict[str, Any]) -> bool:
|
|
||||||
"""Best-effort persist a previously captured run snapshot."""
|
|
||||||
if self._store is None:
|
|
||||||
return True
|
|
||||||
try:
|
|
||||||
await self._call_store_with_retry(
|
|
||||||
"put",
|
|
||||||
run_id,
|
|
||||||
lambda: self._store.put(run_id, **payload),
|
|
||||||
)
|
|
||||||
return True
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to persist run %s to store", run_id, exc_info=True)
|
|
||||||
return False
|
|
||||||
|
|
||||||
async def _persist_new_run_to_store(self, record: RunRecord) -> None:
|
|
||||||
"""Persist a newly created run record to the backing store.
|
|
||||||
|
|
||||||
Initial run creation is part of the run visibility boundary: callers
|
|
||||||
should not observe a run in memory unless its backing store row exists.
|
|
||||||
Unlike follow-up status/model updates, failures are propagated so the
|
|
||||||
caller can treat creation as failed. Rollback is the caller's
|
|
||||||
responsibility after inserting the record into ``_runs``.
|
|
||||||
"""
|
|
||||||
if self._store is None:
|
if self._store is None:
|
||||||
return
|
return
|
||||||
await self._call_store_with_retry(
|
|
||||||
"put",
|
|
||||||
record.run_id,
|
|
||||||
lambda: self._store.put(record.run_id, **self._store_put_payload(record)),
|
|
||||||
)
|
|
||||||
|
|
||||||
async def _persist_to_store(self, record: RunRecord, *, error: str | None = None) -> bool:
|
|
||||||
"""Best-effort persist run record to backing store."""
|
|
||||||
return await self._persist_snapshot_to_store(
|
|
||||||
record.run_id,
|
|
||||||
self._store_put_payload(record, error=error),
|
|
||||||
)
|
|
||||||
|
|
||||||
async def _persist_status(self, record: RunRecord, status: RunStatus, *, error: str | None = None) -> bool:
|
|
||||||
"""Best-effort persist a status transition to the backing store."""
|
|
||||||
if self._store is None:
|
|
||||||
return True
|
|
||||||
row_recovery_payload = self._store_put_payload(record, error=error)
|
|
||||||
try:
|
try:
|
||||||
updated = await self._call_store_with_retry(
|
await self._store.put(
|
||||||
"update_status",
|
|
||||||
record.run_id,
|
record.run_id,
|
||||||
lambda: self._store.update_status(record.run_id, status.value, error=error),
|
thread_id=record.thread_id,
|
||||||
|
assistant_id=record.assistant_id,
|
||||||
|
status=record.status.value,
|
||||||
|
multitask_strategy=record.multitask_strategy,
|
||||||
|
metadata=record.metadata or {},
|
||||||
|
kwargs=record.kwargs or {},
|
||||||
|
created_at=record.created_at,
|
||||||
|
model_name=record.model_name,
|
||||||
)
|
)
|
||||||
if updated is False:
|
|
||||||
return await self._persist_snapshot_to_store(record.run_id, row_recovery_payload)
|
|
||||||
return True
|
|
||||||
except Exception:
|
except Exception:
|
||||||
logger.warning("Failed to persist status update for run %s", record.run_id, exc_info=True)
|
logger.warning("Failed to persist run %s to store", record.run_id, exc_info=True)
|
||||||
return False
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _record_from_store(row: dict[str, Any]) -> RunRecord:
|
|
||||||
"""Build a read-only runtime record from a serialized store row.
|
|
||||||
|
|
||||||
NULL status/on_disconnect columns (e.g. from rows written before those
|
|
||||||
columns were added) default to ``pending`` and ``cancel`` respectively.
|
|
||||||
"""
|
|
||||||
return RunRecord(
|
|
||||||
run_id=row["run_id"],
|
|
||||||
thread_id=row["thread_id"],
|
|
||||||
assistant_id=row.get("assistant_id"),
|
|
||||||
status=RunStatus(row.get("status") or RunStatus.pending.value),
|
|
||||||
on_disconnect=DisconnectMode(row.get("on_disconnect") or DisconnectMode.cancel.value),
|
|
||||||
multitask_strategy=row.get("multitask_strategy") or "reject",
|
|
||||||
metadata=row.get("metadata") or {},
|
|
||||||
kwargs=row.get("kwargs") or {},
|
|
||||||
created_at=row.get("created_at") or "",
|
|
||||||
updated_at=row.get("updated_at") or "",
|
|
||||||
error=row.get("error"),
|
|
||||||
model_name=row.get("model_name"),
|
|
||||||
store_only=True,
|
|
||||||
total_input_tokens=row.get("total_input_tokens") or 0,
|
|
||||||
total_output_tokens=row.get("total_output_tokens") or 0,
|
|
||||||
total_tokens=row.get("total_tokens") or 0,
|
|
||||||
llm_call_count=row.get("llm_call_count") or 0,
|
|
||||||
lead_agent_tokens=row.get("lead_agent_tokens") or 0,
|
|
||||||
subagent_tokens=row.get("subagent_tokens") or 0,
|
|
||||||
middleware_tokens=row.get("middleware_tokens") or 0,
|
|
||||||
message_count=row.get("message_count") or 0,
|
|
||||||
last_ai_message=row.get("last_ai_message"),
|
|
||||||
first_human_message=row.get("first_human_message"),
|
|
||||||
)
|
|
||||||
|
|
||||||
async def update_run_completion(self, run_id: str, **kwargs) -> None:
|
async def update_run_completion(self, run_id: str, **kwargs) -> None:
|
||||||
"""Persist token usage and completion data to the backing store."""
|
"""Persist token usage and completion data to the backing store."""
|
||||||
row_recovery_payload: dict[str, Any] | None = None
|
if self._store is not None:
|
||||||
async with self._lock:
|
|
||||||
record = self._runs.get(run_id)
|
|
||||||
if record is not None:
|
|
||||||
for key, value in kwargs.items():
|
|
||||||
if key == "status":
|
|
||||||
continue
|
|
||||||
if hasattr(record, key) and value is not None:
|
|
||||||
setattr(record, key, value)
|
|
||||||
record.updated_at = _now_iso()
|
|
||||||
row_recovery_payload = self._store_put_payload(record, error=kwargs.get("error"))
|
|
||||||
if self._store is None:
|
|
||||||
return
|
|
||||||
try:
|
|
||||||
updated = await self._call_store_with_retry(
|
|
||||||
"update_run_completion",
|
|
||||||
run_id,
|
|
||||||
lambda: self._store.update_run_completion(run_id, **kwargs),
|
|
||||||
)
|
|
||||||
if updated is False:
|
|
||||||
if row_recovery_payload is None:
|
|
||||||
logger.warning("Failed to recreate missing run %s for completion persistence", run_id)
|
|
||||||
return
|
|
||||||
if not await self._persist_snapshot_to_store(run_id, row_recovery_payload):
|
|
||||||
return
|
|
||||||
recovered = await self._call_store_with_retry(
|
|
||||||
"update_run_completion",
|
|
||||||
run_id,
|
|
||||||
lambda: self._store.update_run_completion(run_id, **kwargs),
|
|
||||||
)
|
|
||||||
if recovered is False:
|
|
||||||
logger.warning("Run completion update for %s affected no rows after row recreation", run_id)
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to persist run completion for %s", run_id, exc_info=True)
|
|
||||||
|
|
||||||
async def update_run_progress(self, run_id: str, **kwargs) -> None:
|
|
||||||
"""Persist a running token/message snapshot without changing status."""
|
|
||||||
should_persist = True
|
|
||||||
async with self._lock:
|
|
||||||
record = self._runs.get(run_id)
|
|
||||||
if record is not None:
|
|
||||||
should_persist = record.status == RunStatus.running
|
|
||||||
if record is not None and should_persist:
|
|
||||||
for key, value in kwargs.items():
|
|
||||||
if hasattr(record, key) and value is not None:
|
|
||||||
setattr(record, key, value)
|
|
||||||
record.updated_at = _now_iso()
|
|
||||||
if should_persist and self._store is not None:
|
|
||||||
try:
|
try:
|
||||||
await self._store.update_run_progress(run_id, **kwargs)
|
await self._store.update_run_completion(run_id, **kwargs)
|
||||||
except Exception:
|
except Exception:
|
||||||
logger.warning("Failed to persist run progress for %s", run_id, exc_info=True)
|
logger.warning("Failed to persist run completion for %s", run_id, exc_info=True)
|
||||||
|
|
||||||
async def create(
|
async def create(
|
||||||
self,
|
self,
|
||||||
@@ -338,91 +106,20 @@ class RunManager:
|
|||||||
)
|
)
|
||||||
async with self._lock:
|
async with self._lock:
|
||||||
self._runs[run_id] = record
|
self._runs[run_id] = record
|
||||||
persisted = False
|
await self._persist_to_store(record)
|
||||||
try:
|
|
||||||
await self._persist_new_run_to_store(record)
|
|
||||||
persisted = True
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to persist run %s; rolled back in-memory record", run_id, exc_info=True)
|
|
||||||
raise
|
|
||||||
finally:
|
|
||||||
# Also covers cancellation, which bypasses ``except Exception``.
|
|
||||||
if not persisted:
|
|
||||||
self._runs.pop(run_id, None)
|
|
||||||
logger.info("Run created: run_id=%s thread_id=%s", run_id, thread_id)
|
logger.info("Run created: run_id=%s thread_id=%s", run_id, thread_id)
|
||||||
return record
|
return record
|
||||||
|
|
||||||
async def get(self, run_id: str, *, user_id: str | None = None) -> RunRecord | None:
|
def get(self, run_id: str) -> RunRecord | None:
|
||||||
"""Return a run record by ID, or ``None``.
|
"""Return a run record by ID, or ``None``."""
|
||||||
|
return self._runs.get(run_id)
|
||||||
|
|
||||||
Args:
|
async def list_by_thread(self, thread_id: str) -> list[RunRecord]:
|
||||||
run_id: The run ID to look up.
|
"""Return all runs for a given thread, newest first."""
|
||||||
user_id: Optional user ID for permission filtering when hydrating from store.
|
|
||||||
"""
|
|
||||||
async with self._lock:
|
async with self._lock:
|
||||||
record = self._runs.get(run_id)
|
# Dict insertion order matches creation order, so reversing it gives
|
||||||
if record is not None:
|
# us deterministic newest-first results even when timestamps tie.
|
||||||
return record
|
return [r for r in self._runs.values() if r.thread_id == thread_id]
|
||||||
if self._store is None:
|
|
||||||
return None
|
|
||||||
try:
|
|
||||||
row = await self._store.get(run_id, user_id=user_id)
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to hydrate run %s from store", run_id, exc_info=True)
|
|
||||||
return None
|
|
||||||
# Re-check after store await: a concurrent create() may have inserted the
|
|
||||||
# in-memory record while the store call was in flight.
|
|
||||||
async with self._lock:
|
|
||||||
record = self._runs.get(run_id)
|
|
||||||
if record is not None:
|
|
||||||
return record
|
|
||||||
if row is None:
|
|
||||||
return None
|
|
||||||
try:
|
|
||||||
return self._record_from_store(row)
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to map store row for run %s", run_id, exc_info=True)
|
|
||||||
return None
|
|
||||||
|
|
||||||
async def aget(self, run_id: str, *, user_id: str | None = None) -> RunRecord | None:
|
|
||||||
"""Return a run record by ID, checking the persistent store as fallback.
|
|
||||||
|
|
||||||
Alias for :meth:`get` for backward compatibility.
|
|
||||||
"""
|
|
||||||
return await self.get(run_id, user_id=user_id)
|
|
||||||
|
|
||||||
async def list_by_thread(self, thread_id: str, *, user_id: str | None = None, limit: int = 100) -> list[RunRecord]:
|
|
||||||
"""Return runs for a given thread, newest first, at most ``limit`` records.
|
|
||||||
|
|
||||||
In-memory runs take precedence only when the same ``run_id`` exists in both
|
|
||||||
memory and the backing store. The merged result is then sorted newest-first
|
|
||||||
by ``created_at`` and trimmed to ``limit`` (default 100).
|
|
||||||
|
|
||||||
Args:
|
|
||||||
thread_id: The thread ID to filter by.
|
|
||||||
user_id: Optional user ID for permission filtering when hydrating from store.
|
|
||||||
limit: Maximum number of runs to return.
|
|
||||||
"""
|
|
||||||
async with self._lock:
|
|
||||||
# Dict insertion order gives deterministic results when timestamps tie.
|
|
||||||
memory_records = [r for r in self._runs.values() if r.thread_id == thread_id]
|
|
||||||
if self._store is None:
|
|
||||||
return sorted(memory_records, key=lambda r: r.created_at, reverse=True)[:limit]
|
|
||||||
records_by_id = {record.run_id: record for record in memory_records}
|
|
||||||
store_limit = max(0, limit - len(memory_records))
|
|
||||||
try:
|
|
||||||
rows = await self._store.list_by_thread(thread_id, user_id=user_id, limit=store_limit)
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to hydrate runs for thread %s from store", thread_id, exc_info=True)
|
|
||||||
return sorted(memory_records, key=lambda r: r.created_at, reverse=True)[:limit]
|
|
||||||
for row in rows:
|
|
||||||
run_id = row.get("run_id")
|
|
||||||
if run_id and run_id not in records_by_id:
|
|
||||||
try:
|
|
||||||
records_by_id[run_id] = self._record_from_store(row)
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to map store row for run %s", run_id, exc_info=True)
|
|
||||||
return sorted(records_by_id.values(), key=lambda record: record.created_at, reverse=True)[:limit]
|
|
||||||
|
|
||||||
async def set_status(self, run_id: str, status: RunStatus, *, error: str | None = None) -> None:
|
async def set_status(self, run_id: str, status: RunStatus, *, error: str | None = None) -> None:
|
||||||
"""Transition a run to a new status."""
|
"""Transition a run to a new status."""
|
||||||
@@ -435,22 +132,13 @@ class RunManager:
|
|||||||
record.updated_at = _now_iso()
|
record.updated_at = _now_iso()
|
||||||
if error is not None:
|
if error is not None:
|
||||||
record.error = error
|
record.error = error
|
||||||
await self._persist_status(record, status, error=error)
|
if self._store is not None:
|
||||||
|
try:
|
||||||
|
await self._store.update_status(run_id, status.value, error=error)
|
||||||
|
except Exception:
|
||||||
|
logger.warning("Failed to persist status update for run %s", run_id, exc_info=True)
|
||||||
logger.info("Run %s -> %s", run_id, status.value)
|
logger.info("Run %s -> %s", run_id, status.value)
|
||||||
|
|
||||||
async def _persist_model_name(self, run_id: str, model_name: str | None) -> None:
|
|
||||||
"""Best-effort persist model_name update to the backing store."""
|
|
||||||
if self._store is None:
|
|
||||||
return
|
|
||||||
try:
|
|
||||||
await self._call_store_with_retry(
|
|
||||||
"update_model_name",
|
|
||||||
run_id,
|
|
||||||
lambda: self._store.update_model_name(run_id, model_name),
|
|
||||||
)
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to persist model_name update for run %s", run_id, exc_info=True)
|
|
||||||
|
|
||||||
async def update_model_name(self, run_id: str, model_name: str | None) -> None:
|
async def update_model_name(self, run_id: str, model_name: str | None) -> None:
|
||||||
"""Update the model name for a run."""
|
"""Update the model name for a run."""
|
||||||
async with self._lock:
|
async with self._lock:
|
||||||
@@ -460,7 +148,7 @@ class RunManager:
|
|||||||
return
|
return
|
||||||
record.model_name = model_name
|
record.model_name = model_name
|
||||||
record.updated_at = _now_iso()
|
record.updated_at = _now_iso()
|
||||||
await self._persist_model_name(run_id, model_name)
|
await self._persist_to_store(record)
|
||||||
logger.info("Run %s model_name=%s", run_id, model_name)
|
logger.info("Run %s model_name=%s", run_id, model_name)
|
||||||
|
|
||||||
async def cancel(self, run_id: str, *, action: str = "interrupt") -> bool:
|
async def cancel(self, run_id: str, *, action: str = "interrupt") -> bool:
|
||||||
@@ -471,17 +159,12 @@ class RunManager:
|
|||||||
action: "interrupt" keeps checkpoint, "rollback" reverts to pre-run state.
|
action: "interrupt" keeps checkpoint, "rollback" reverts to pre-run state.
|
||||||
|
|
||||||
Sets the abort event with the action reason and cancels the asyncio task.
|
Sets the abort event with the action reason and cancels the asyncio task.
|
||||||
Returns ``True`` if cancellation was initiated **or** the run was already
|
Returns ``True`` if the run was in-flight and cancellation was initiated.
|
||||||
interrupted (idempotent — a second cancel is a no-op success).
|
|
||||||
Returns ``False`` only when the run is unknown to this worker or has
|
|
||||||
reached a terminal state other than interrupted (completed, failed, etc.).
|
|
||||||
"""
|
"""
|
||||||
async with self._lock:
|
async with self._lock:
|
||||||
record = self._runs.get(run_id)
|
record = self._runs.get(run_id)
|
||||||
if record is None:
|
if record is None:
|
||||||
return False
|
return False
|
||||||
if record.status == RunStatus.interrupted:
|
|
||||||
return True # idempotent — already cancelled on this worker
|
|
||||||
if record.status not in (RunStatus.pending, RunStatus.running):
|
if record.status not in (RunStatus.pending, RunStatus.running):
|
||||||
return False
|
return False
|
||||||
record.abort_action = action
|
record.abort_action = action
|
||||||
@@ -490,7 +173,6 @@ class RunManager:
|
|||||||
record.task.cancel()
|
record.task.cancel()
|
||||||
record.status = RunStatus.interrupted
|
record.status = RunStatus.interrupted
|
||||||
record.updated_at = _now_iso()
|
record.updated_at = _now_iso()
|
||||||
await self._persist_status(record, RunStatus.interrupted)
|
|
||||||
logger.info("Run %s cancelled (action=%s)", run_id, action)
|
logger.info("Run %s cancelled (action=%s)", run_id, action)
|
||||||
return True
|
return True
|
||||||
|
|
||||||
@@ -518,7 +200,6 @@ class RunManager:
|
|||||||
now = _now_iso()
|
now = _now_iso()
|
||||||
|
|
||||||
_supported_strategies = ("reject", "interrupt", "rollback")
|
_supported_strategies = ("reject", "interrupt", "rollback")
|
||||||
interrupted_records: list[RunRecord] = []
|
|
||||||
|
|
||||||
async with self._lock:
|
async with self._lock:
|
||||||
if multitask_strategy not in _supported_strategies:
|
if multitask_strategy not in _supported_strategies:
|
||||||
@@ -530,8 +211,15 @@ class RunManager:
|
|||||||
raise ConflictError(f"Thread {thread_id} already has an active run")
|
raise ConflictError(f"Thread {thread_id} already has an active run")
|
||||||
|
|
||||||
if multitask_strategy in ("interrupt", "rollback") and inflight:
|
if multitask_strategy in ("interrupt", "rollback") and inflight:
|
||||||
|
for r in inflight:
|
||||||
|
r.abort_action = multitask_strategy
|
||||||
|
r.abort_event.set()
|
||||||
|
if r.task is not None and not r.task.done():
|
||||||
|
r.task.cancel()
|
||||||
|
r.status = RunStatus.interrupted
|
||||||
|
r.updated_at = now
|
||||||
logger.info(
|
logger.info(
|
||||||
"Preparing to cancel %d inflight run(s) on thread %s (strategy=%s)",
|
"Cancelled %d inflight run(s) on thread %s (strategy=%s)",
|
||||||
len(inflight),
|
len(inflight),
|
||||||
thread_id,
|
thread_id,
|
||||||
multitask_strategy,
|
multitask_strategy,
|
||||||
@@ -551,87 +239,11 @@ class RunManager:
|
|||||||
model_name=model_name,
|
model_name=model_name,
|
||||||
)
|
)
|
||||||
self._runs[run_id] = record
|
self._runs[run_id] = record
|
||||||
persisted = False
|
|
||||||
try:
|
|
||||||
await self._persist_new_run_to_store(record)
|
|
||||||
persisted = True
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to persist run %s; rolled back in-memory record", run_id, exc_info=True)
|
|
||||||
raise
|
|
||||||
finally:
|
|
||||||
# Also covers cancellation, which bypasses ``except Exception``.
|
|
||||||
if not persisted:
|
|
||||||
self._runs.pop(run_id, None)
|
|
||||||
|
|
||||||
if multitask_strategy in ("interrupt", "rollback") and inflight:
|
await self._persist_to_store(record)
|
||||||
for r in inflight:
|
|
||||||
r.abort_action = multitask_strategy
|
|
||||||
r.abort_event.set()
|
|
||||||
if r.task is not None and not r.task.done():
|
|
||||||
r.task.cancel()
|
|
||||||
r.status = RunStatus.interrupted
|
|
||||||
r.updated_at = now
|
|
||||||
interrupted_records.append(r)
|
|
||||||
|
|
||||||
for interrupted_record in interrupted_records:
|
|
||||||
await self._persist_status(interrupted_record, RunStatus.interrupted)
|
|
||||||
logger.info("Run created: run_id=%s thread_id=%s", run_id, thread_id)
|
logger.info("Run created: run_id=%s thread_id=%s", run_id, thread_id)
|
||||||
return record
|
return record
|
||||||
|
|
||||||
async def reconcile_orphaned_inflight_runs(
|
|
||||||
self,
|
|
||||||
*,
|
|
||||||
error: str,
|
|
||||||
before: str | None = None,
|
|
||||||
) -> list[RunRecord]:
|
|
||||||
"""Mark persisted active runs as failed when no local task owns them.
|
|
||||||
|
|
||||||
Gateway runs are process-local: the asyncio task and abort event live in
|
|
||||||
memory, while the run row is durable. After a SQLite-backed gateway
|
|
||||||
restart, any persisted ``pending`` or ``running`` row created before
|
|
||||||
startup cannot still have a local worker. This recovery step turns that
|
|
||||||
ambiguous state into an explicit error instead of letting the UI show an
|
|
||||||
indefinite active run.
|
|
||||||
"""
|
|
||||||
if self._store is None:
|
|
||||||
return []
|
|
||||||
try:
|
|
||||||
rows = await self._call_store_with_retry(
|
|
||||||
"list_inflight",
|
|
||||||
"*",
|
|
||||||
lambda: self._store.list_inflight(before=before),
|
|
||||||
)
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to list orphaned inflight runs for reconciliation", exc_info=True)
|
|
||||||
return []
|
|
||||||
|
|
||||||
recovered: list[RunRecord] = []
|
|
||||||
now = _now_iso()
|
|
||||||
for row in rows:
|
|
||||||
try:
|
|
||||||
record = self._record_from_store(row)
|
|
||||||
except Exception:
|
|
||||||
logger.warning("Failed to map orphaned run row during reconciliation", exc_info=True)
|
|
||||||
continue
|
|
||||||
|
|
||||||
async with self._lock:
|
|
||||||
live_record = self._runs.get(record.run_id)
|
|
||||||
if live_record is not None and live_record.status in (RunStatus.pending, RunStatus.running):
|
|
||||||
continue
|
|
||||||
|
|
||||||
record.status = RunStatus.error
|
|
||||||
record.error = error
|
|
||||||
record.updated_at = now
|
|
||||||
persisted = await self._persist_status(record, RunStatus.error, error=error)
|
|
||||||
if not persisted:
|
|
||||||
logger.warning("Skipped orphaned run %s recovery because error status was not persisted", record.run_id)
|
|
||||||
continue
|
|
||||||
recovered.append(record)
|
|
||||||
|
|
||||||
if recovered:
|
|
||||||
logger.warning("Recovered %d orphaned inflight run(s) as error", len(recovered))
|
|
||||||
return recovered
|
|
||||||
|
|
||||||
async def has_inflight(self, thread_id: str) -> bool:
|
async def has_inflight(self, thread_id: str) -> bool:
|
||||||
"""Return ``True`` if *thread_id* has a pending or running run."""
|
"""Return ``True`` if *thread_id* has a pending or running run."""
|
||||||
async with self._lock:
|
async with self._lock:
|
||||||
|
|||||||
@@ -1,16 +0,0 @@
|
|||||||
"""Run naming helpers for LangChain/LangSmith tracing."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from collections.abc import Mapping
|
|
||||||
from typing import Any
|
|
||||||
|
|
||||||
|
|
||||||
def resolve_root_run_name(config: Mapping[str, Any], assistant_id: str | None) -> str:
|
|
||||||
for container_name in ("context", "configurable"):
|
|
||||||
container = config.get(container_name)
|
|
||||||
if isinstance(container, Mapping):
|
|
||||||
agent_name = container.get("agent_name")
|
|
||||||
if isinstance(agent_name, str) and agent_name.strip():
|
|
||||||
return agent_name
|
|
||||||
return assistant_id or "lead_agent"
|
|
||||||
@@ -1,9 +0,0 @@
|
|||||||
"""Repository contracts for the run runtime application layer."""
|
|
||||||
|
|
||||||
from .run_event_log import RunEventLog
|
|
||||||
from .run_repository import RunRepository
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"RunEventLog",
|
|
||||||
"RunRepository",
|
|
||||||
]
|
|
||||||
@@ -1,42 +0,0 @@
|
|||||||
"""Durable run event log contract."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from typing import TYPE_CHECKING, Protocol
|
|
||||||
|
|
||||||
from ..domain import RunEvent, RunId, ThreadId
|
|
||||||
|
|
||||||
if TYPE_CHECKING:
|
|
||||||
from ..application.dto import RunMessageView, StoredRunEvent
|
|
||||||
|
|
||||||
|
|
||||||
class RunEventLog(Protocol):
|
|
||||||
"""Persistence boundary for run messages and execution trace events."""
|
|
||||||
|
|
||||||
async def append(self, events: list[RunEvent]) -> list[StoredRunEvent]:
|
|
||||||
pass
|
|
||||||
|
|
||||||
async def list_messages_by_run(
|
|
||||||
self,
|
|
||||||
thread_id: ThreadId,
|
|
||||||
run_id: RunId,
|
|
||||||
*,
|
|
||||||
limit: int = 50,
|
|
||||||
before_seq: int | None = None,
|
|
||||||
after_seq: int | None = None,
|
|
||||||
) -> list[RunMessageView]:
|
|
||||||
pass
|
|
||||||
|
|
||||||
async def list_events_by_run(
|
|
||||||
self,
|
|
||||||
thread_id: ThreadId,
|
|
||||||
run_id: RunId,
|
|
||||||
*,
|
|
||||||
limit: int = 500,
|
|
||||||
) -> list[StoredRunEvent]:
|
|
||||||
pass
|
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
|
||||||
"RunEventLog",
|
|
||||||
]
|
|
||||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user