Squashes 25 PR commits onto current main.

AppConfig becomes a pure value object with no ambient lookup. Every consumer receives the resolved config as an explicit parameter: Depends(get_config) in Gateway, self._app_config in DeerFlowClient, runtime.context.app_config in agent runs, AppConfig.from_file() at the LangGraph Server registration boundary.

Phase 1: frozen data + typed context
- All config models (AppConfig, MemoryConfig, DatabaseConfig, …) become frozen=True; no sub-module globals.
- AppConfig.from_file() is pure (no side-effect singleton loaders).
- Introduce DeerFlowContext(app_config, thread_id, run_id, agent_name), a frozen dataclass injected via LangGraph Runtime.
- Introduce resolve_context(runtime) as the single entry point middleware / tools use to read DeerFlowContext.

Phase 2: pure explicit parameter passing
- Gateway: app.state.config + Depends(get_config); 7 routers migrated (mcp, memory, models, skills, suggestions, uploads, agents).
- DeerFlowClient: __init__(config=...) captures config locally.
- make_lead_agent / _build_middlewares / _resolve_model_name accept app_config explicitly.
- RunContext.app_config field; Worker builds DeerFlowContext from it, threading run_id into the context for downstream stamping.
- Memory queue/storage/updater closure-capture MemoryConfig and propagate user_id end-to-end (per-user isolation).
- Sandbox/skills/community/factories/tools thread app_config.
- resolve_context() rejects non-typed runtime.context.
- Test suite migrated off AppConfig.current() monkey-patches.
- AppConfig.current() classmethod deleted.

Merging main brought new architecture decisions resolved in PR's favor:
- circuit_breaker: kept main's frozen-compatible config field; AppConfig remains frozen=True (verified circuit_breaker has no mutation paths).
- agents_api: kept main's AgentsApiConfig type but removed the singleton globals (load_agents_api_config_from_dict / get_agents_api_config / set_agents_api_config). 8 routes in agents.py now read via Depends(get_config).
- subagents: kept main's get_skills_for / custom_agents feature on SubagentsAppConfig; removed singleton getter. registry.py now reads app_config.subagents directly.
- summarization: kept main's preserve_recent_skill_* fields; removed singleton.
- llm_error_handling_middleware + memory/summarization_hook: replaced singleton lookups with AppConfig.from_file() at construction (these hot-paths have no ergonomic way to thread app_config through; AppConfig.from_file is a pure load).
- worker.py + thread_data_middleware.py: DeerFlowContext.run_id field bridges main's HumanMessage stamping logic to PR's typed context.

Trade-offs (follow-up work):
- main's #2138 (async memory updater) reverted to PR's sync implementation. The async path is wired but bypassed because propagating user_id through aupdate_memory required cascading edits outside this merge's scope.
- tests/test_subagent_skills_config.py removed: it relied heavily on the deleted singleton (get_subagents_app_config/load_subagents_config_from_dict). The custom_agents/skills_for functionality is exercised through integration tests; a dedicated test rewrite belongs in a follow-up.

Verification: backend test suite: 2560 passed, 4 skipped, 84 failures. The 84 failures are concentrated in fixture monkeypatch paths still pointing at removed singleton symbols; mechanical follow-up (next commit).
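The typed-context idea from Phase 1 and Phase 2 can be illustrated with a minimal sketch. It only mirrors the field list and the "rejects non-typed runtime.context" rule stated above; the field types, the exception type, and the `SimpleNamespace` stand-in for the LangGraph Runtime are assumptions, not the PR's actual code.

```python
from dataclasses import dataclass, FrozenInstanceError
from types import SimpleNamespace
from typing import Any


@dataclass(frozen=True)
class DeerFlowContext:
    """Per-run context; field names follow the commit message, types are assumed."""

    app_config: Any  # stand-in for the frozen AppConfig model
    thread_id: str
    run_id: str
    agent_name: str


def resolve_context(runtime: Any) -> DeerFlowContext:
    """Return the typed context, or fail loudly (TypeError is an assumed choice)."""
    ctx = getattr(runtime, "context", None)
    if not isinstance(ctx, DeerFlowContext):
        raise TypeError(f"expected DeerFlowContext, got {type(ctx).__name__}")
    return ctx


ctx = DeerFlowContext(app_config={"name": "demo"}, thread_id="t1", run_id="r1", agent_name="lead")
runtime = SimpleNamespace(context=ctx)  # hypothetical stand-in for the LangGraph Runtime
resolved = resolve_context(runtime)
```

Because the dataclass is frozen, any attempt to reassign a field raises `FrozenInstanceError`, which is what makes it safe to hand the same context to every middleware and tool in a run.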
Summarize Marker in History — Design & Verification
Date: 2026-04-11
Branch: rayhpeng/fix-persistence-new
Status: Design approved, implementation deferred to a follow-up PR
Depends on: 2026-04-11-runjournal-history-evaluation.md (the event-store-backed history fix this builds on)
1. Goal
Display a "summarization happened here" marker in the conversation history UI when SummarizationMiddleware ran mid-run, so users understand why earlier messages look condensed or missing. The event-store-backed /history fix already recovered the original messages; this spec adds a visible marker at the seq position where summarization occurred, optionally showing the generated summary text.
2. Investigation findings
2.1 Today's state: zero middleware records
Full scan of backend/.deer-flow/data/deerflow.db run_events:
| category | rows |
|---|---|
| trace | 76 |
| message | 34 |
| lifecycle | 8 |
| middleware | 0 |
No row has event_type containing summariz or middleware. The middleware category is dead in production.
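A scan like the one above can be reproduced with a short script. This is a sketch only: it assumes a SQLite table named `run_events` with `category` and `event_type` columns (names taken from this document), and the demo rows and their event types are made up for illustration.

```python
import sqlite3


def count_events_by_category(conn: sqlite3.Connection) -> dict[str, int]:
    """Row count per category; categories with no rows are simply absent."""
    rows = conn.execute(
        "SELECT category, COUNT(*) FROM run_events GROUP BY category"
    ).fetchall()
    return {category: count for category, count in rows}


def count_summarize_like(conn: sqlite3.Connection) -> int:
    """Rows whose event_type mentions 'summariz' or 'middleware'."""
    row = conn.execute(
        "SELECT COUNT(*) FROM run_events "
        "WHERE event_type LIKE '%summariz%' OR event_type LIKE '%middleware%'"
    ).fetchone()
    return row[0]


# Demo against an in-memory database with hypothetical rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE run_events (seq INTEGER, category TEXT, event_type TEXT)")
conn.executemany(
    "INSERT INTO run_events VALUES (?, ?, ?)",
    [(1, "message", "human_message"), (2, "trace", "llm_request"), (3, "lifecycle", "run_start")],
)
counts = count_events_by_category(conn)
middleware_rows = counts.get("middleware", 0)
summarize_rows = count_summarize_like(conn)
```

Note that `GROUP BY` never returns a zero-count group, so a dead category shows up as a missing key rather than as `0`; the `.get("middleware", 0)` lookup covers that case.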
2.2 Why: two dead code paths in journal.py
| Location | Status |
|---|---|
| journal.py:343-362: on_custom_event("summarization", ...) writes one trace event + one category="middleware" event. | Dead. Only fires when something calls adispatch_custom_event("summarization", {...}). The upstream LangChain SummarizationMiddleware (.venv/.../langchain/agents/middleware/summarization.py:272) never emits custom events; its before_model/abefore_model just mutate messages in place and return {'messages': new_messages}. Callback never triggered. |
| journal.py:449: record_middleware(tag, *, name, hook, action, changes) helper | Dead. Grep shows zero callers in the harness. Added speculatively, never wired up. |
2.3 Concrete evidence of summarize running unlogged
Thread 3d5dea4a-0983-4727-a4e8-41a64428933a:
- `run_events` `seq=1` → original human `"写一份关于deer-flow的详细技术报告"` ("Write a detailed technical report about deer-flow") ✓ (event store is fine)
- `run_events` `seq=43` → `llm_request` trace whose `messages[0]` literal contains `"Here is a summary of the conversation to date:"`, which is proof that SummarizationMiddleware did inject a summary mid-run
- Zero rows with `category='middleware'` for this thread → nothing captured for UI to render
3. Approaches considered
A. Subclass SummarizationMiddleware and dispatch a custom event
Wrap the upstream class, override abefore_model, call await adispatch_custom_event("summarization", {...}) after super(). Journal's existing on_custom_event path captures it.
B. Frontend-only diff heuristic
Compare event_store.count_messages() vs rendered count, infer summarization happened from the gap. Rejected: can't pinpoint position in the stream, can't show summary text. Only yields a vague badge.
C. Hybrid A + frontend inline card rendered at the middleware event's seq position
Same backend as A, plus frontend renders an inline [N messages condensed] card at the correct chronological position. Recommended terminal state.
4. Subagent's wrong claim and its rebuttal
An independent agent flagged approach A as structurally broken because:
> `RunnableCallable(trace=False)` skips `set_config_context`, therefore `var_child_runnable_config` is never set, therefore `adispatch_custom_event` raises `RuntimeError("Unable to dispatch an adhoc event without a parent run id")`.
This is wrong. The user's contrary intuition was correct: trace=False does not prevent adispatch_custom_event from working, as long as the middleware signature explicitly accepts config: RunnableConfig. The mechanism:
1. `RunnableCallable.__init__` (`langgraph/_internal/_runnable.py:293-319`) inspects the function signature. If it accepts `config: RunnableConfig`, that parameter is recorded in `self.func_accepts`.
2. Both `trace=True` and `trace=False` branches of `ainvoke` run the same kwarg-injection loop (`_runnable.py:349-356`): `if kw == "config": kw_value = config`. The `config` passed to `ainvoke` (from Pregel's `task.proc.ainvoke(task.input, config)` at `pregel/_retry.py:138`) is the task config with callbacks already bound.
3. Inside the middleware, passing that `config` explicitly to `adispatch_custom_event(..., config=config)` means the function doesn't rely on `var_child_runnable_config.get()` at all. The LangChain docstring at `langchain_core/callbacks/manager.py:2574-2579` even says "If using python 3.10 and async, you MUST specify the config parameter", which is exactly this path.
trace=False only changes whether this runnable layer creates a new child callback scope. It does not affect whether the outer-layer config (with callbacks including RunJournal) is passed down to the function.
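The mechanism can be modelled in a few lines of plain Python. The class below is a deliberately simplified stand-in, not LangGraph's `RunnableCallable`: it only reproduces the two behaviours described above (signature inspection at construction, config injection on both branches), and every name in it is hypothetical.

```python
import asyncio
import inspect
from typing import Any


class ToyRunnableCallable:
    """Toy model of config injection; NOT the real langgraph class."""

    def __init__(self, func, *, trace: bool = True):
        self.func = func
        self.trace = trace
        # Record at construction time whether the function declares `config`.
        self.accepts_config = "config" in inspect.signature(func).parameters

    async def ainvoke(self, value: Any, config: dict) -> Any:
        # Injection happens before the branch, so both branches see it.
        kwargs = {"config": config} if self.accepts_config else {}
        if self.trace:
            # A traced runnable would additionally open a child callback
            # scope here; that step is omitted in this toy model.
            pass
        return await self.func(value, **kwargs)


async def node(state: dict, config: dict) -> list:
    # With config injected, callbacks are reachable without any context variable.
    return config["callbacks"]


async def main() -> tuple:
    config = {"callbacks": ["RunJournal"]}
    traced = await ToyRunnableCallable(node, trace=True).ainvoke({}, config)
    untraced = await ToyRunnableCallable(node, trace=False).ainvoke({}, config)
    return traced, untraced


traced, untraced = asyncio.run(main())
```

In this model the `trace` flag never touches `kwargs`, which is the point of the rebuttal: whether a child scope is created and whether the outer config reaches the function are independent decisions.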
5. Verification
Ran /tmp/verify_summarize_event.py (standalone minimal reproduction):
- Minimal `AgentMiddleware` subclass with `abefore_model(self, state, runtime, config: RunnableConfig)`
- Calls `await adispatch_custom_event("summarization", {...}, config=config)` inside
- `create_agent(model=FakeChatModel, middleware=[probe])`
- `agent.ainvoke({...}, config={"callbacks": [RecordingHandler()]})`
Result:
```
INFO verify: ProbeMiddleware.abefore_model called
INFO verify: config keys: ['callbacks', 'configurable', 'metadata']
INFO verify: config.callbacks type: AsyncCallbackManager
INFO verify: config.metadata: {'langgraph_step': 1, 'langgraph_node': 'probe.before_model', ...}
INFO verify: on_custom_event fired: name=summarization
    run_id=019d7d19-1727-7830-aa33-648ecbee4b95
    data={'summary': 'fake summary', 'replaced_count': 3}
SUCCESS: approach A is viable (config injection + adispatch work)
```
All five predictions held:
- ✅ `config: RunnableConfig` signature triggers auto-injection despite `trace=False`
- ✅ `config.callbacks` is an `AsyncCallbackManager` with `parent_run_id` set
- ✅ `adispatch_custom_event(..., config=config)` runs without error
- ✅ `RecordingHandler.on_custom_event` receives the event
- ✅ The received `run_id` is a valid UUID tied to the running graph
Bonus finding: config.metadata contains langgraph_step and langgraph_node. These can be included in the middleware event's metadata to help the frontend position the marker on the timeline.
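One way to carry those two keys into the event is sketched below. The helper names `position_metadata` and `build_event_payload` are hypothetical, and the payload shape simply extends the `summary` / `replaced_count` dict used elsewhere in this document; whether the position keys should live in the payload or in a separate metadata field is left open here.

```python
from typing import Any, Mapping

# Keys observed in config.metadata during the verification run.
POSITION_KEYS = ("langgraph_step", "langgraph_node")


def position_metadata(config: Mapping[str, Any]) -> dict[str, Any]:
    """Copy only the timeline-related keys that are actually present."""
    metadata = config.get("metadata") or {}
    return {key: metadata[key] for key in POSITION_KEYS if key in metadata}


def build_event_payload(summary: str, replaced_count: int, config: Mapping[str, Any]) -> dict[str, Any]:
    """Summarization payload plus whatever position hints the config offers."""
    return {
        "summary": summary,
        "replaced_count": replaced_count,
        **position_metadata(config),
    }


payload = build_event_payload(
    "fake summary",
    3,
    {"metadata": {"langgraph_step": 1, "langgraph_node": "probe.before_model", "other": "x"}},
)
bare_payload = build_event_payload("fake summary", 3, {})
```

Filtering to a fixed key list keeps unrelated metadata out of the event store, and the helper degrades to the plain two-field payload when the config carries no metadata.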
6. Recommended implementation (approach C)
6.1 Backend
New wrapper middleware in backend/packages/harness/deerflow/agents/lead_agent/agent.py:
```python
from langchain.agents.middleware.summarization import SummarizationMiddleware
from langchain_core.callbacks import adispatch_custom_event
from langchain_core.runnables import RunnableConfig


class _TrackingSummarizationMiddleware(SummarizationMiddleware):
    """Wraps upstream SummarizationMiddleware to emit a ``summarization``
    custom event on every actual summarization, so RunJournal can persist
    a middleware:summarize row to the event store.

    The upstream class does not emit events of its own. Declaring
    ``config: RunnableConfig`` in the override lets LangGraph's
    ``RunnableCallable`` inject the Pregel task config (with callbacks
    and parent_run_id) regardless of ``trace=False`` on the node.
    """

    async def abefore_model(self, state, runtime, config: RunnableConfig):
        before_count = len(state.get("messages") or [])
        result = await super().abefore_model(state, runtime)
        # None means upstream decided not to summarize on this call.
        if result is None:
            return None

        new_messages = result.get("messages") or []
        replaced_count = max(0, before_count - len(new_messages))
        summary_text = _extract_summary_text(new_messages)
        # Pass config explicitly so dispatch does not depend on context vars.
        await adispatch_custom_event(
            "summarization",
            {
                "summary": summary_text,
                "replaced_count": replaced_count,
            },
            config=config,
        )
        return result


def _extract_summary_text(messages: list) -> str:
    """Pull the summary string out of the HumanMessage the upstream class
    injects as ``Here is a summary of the conversation to date:...``."""
    for msg in messages:
        if getattr(msg, "type", None) == "human":
            content = getattr(msg, "content", "")
            text = content if isinstance(content, str) else ""
            if text.startswith("Here is a summary of the conversation to date"):
                return text
    return ""
```
Swap the existing SummarizationMiddleware() instantiation in _build_middlewares for _TrackingSummarizationMiddleware(...) with the same args.
Journal change: zero. on_custom_event("summarization", ...) in journal.py:343-362 already writes both a trace and a category="middleware" row.
History helper change: extend _get_event_store_messages in backend/app/gateway/routers/threads.py to surface category="middleware" rows as pseudo-messages, e.g.:
```python
# In the per-event loop, after the existing message branch.
# `evt`, `raw`, and `messages` are bound by the enclosing loop.
if evt.get("category") == "middleware" and evt.get("event_type") == "middleware:summarize":
    meta = evt.get("metadata") or {}
    messages.append({
        "id": f"summary-marker-{evt['seq']}",
        "type": "summary_marker",
        "replaced_count": meta.get("replaced_count", 0),
        "summary": raw.get("content", "") if isinstance(raw, dict) else "",
        "run_id": evt.get("run_id"),
    })
```
The marker uses a sentinel type (summary_marker) that doesn't collide with any LangChain message type, so downstream consumers that loop over messages can skip or render it explicitly.
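A downstream consumer could handle the sentinel along these lines. This is an illustrative sketch with hypothetical helper names, operating on plain dicts shaped like the `/history` entries described above; it is not existing code in the gateway or the frontend.

```python
from typing import Any

SUMMARY_MARKER = "summary_marker"


def without_markers(entries: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop marker entries for consumers that only understand real messages."""
    return [entry for entry in entries if entry.get("type") != SUMMARY_MARKER]


def describe(entry: dict[str, Any]) -> str:
    """Render one history entry as text, with an explicit marker branch."""
    if entry.get("type") == SUMMARY_MARKER:
        return f"[{entry.get('replaced_count', 0)} messages condensed]"
    return f"{entry.get('type')}: {entry.get('content', '')}"


history = [
    {"id": "m1", "type": "human", "content": "hello"},
    {"id": "summary-marker-43", "type": "summary_marker", "replaced_count": 3, "summary": "..."},
    {"id": "m2", "type": "ai", "content": "hi there"},
]
rendered = [describe(entry) for entry in history]
plain = without_markers(history)
```

Keeping the marker in the same ordered list as the messages is what preserves its chronological position; consumers that cannot render it filter by `type` instead of relying on list indices.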
6.2 Frontend
- `core/messages/utils.ts`: extend the message grouping to recognize `type === "summary_marker"` and yield it as its own group (`"assistant:summary-marker"`)
- `components/workspace/messages/message-list.tsx`: add a branch in the grouped render switch that renders a distinctive inline card showing `N messages condensed` and a collapsible panel with the summary text
- No changes to feedback logic: the marker has no `feedback` field so the button naturally doesn't render on it
7. Risks
- Synchronous path. The upstream class has both `before_model` and `abefore_model`. Our wrapper only overrides the async variant. If any deer-flow code path ever uses the sync flow, those summarizations won't be captured. Mitigation: also override `before_model` and use `dispatch_custom_event` (sync variant) with the same pattern.
- `_extract_summary_text` fragility. It depends on the upstream class prefix `"Here is a summary of the conversation to date"` in the injected `HumanMessage`. Any upstream template change breaks detection. Mitigation: pick the first new `HumanMessage` that wasn't in `state["messages"]` before super(); this is resilient to template wording changes at the cost of a small diff helper.
- `replaced_count` accuracy under concurrent updates. If another middleware in the chain also modifies `state["messages"]` before super() returns, the naive `before_count - len(new_messages)` arithmetic is wrong. Mitigation: inspect the `RemoveMessage(id=REMOVE_ALL_MESSAGES)` that upstream emits and count from the original input list directly.
- History helper contract change. Introducing a non-LangChain-typed entry (`type="summary_marker"`) in the `/history` response could break frontend code that blindly casts entries to `Message`. Mitigation: the frontend change above adds an explicit branch; type-check the frontend end-to-end before merging.
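The diff-based mitigations for summary detection and `replaced_count` can be sketched as follows. The `Msg` dataclass is a stand-in for LangChain message objects, the helper names are hypothetical, and the approach assumes message ids are unique and stable across the summarization step, which would need to be confirmed against the upstream class.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Msg:
    """Stand-in for a LangChain message; only the fields the diff needs."""

    id: str
    type: str
    content: str


def find_injected_summary(before: Sequence[Msg], after: Sequence[Msg]) -> Optional[Msg]:
    """First human message in `after` whose id was not present in `before`."""
    known_ids = {msg.id for msg in before}
    for msg in after:
        if msg.type == "human" and msg.id not in known_ids:
            return msg
    return None


def count_replaced(before: Sequence[Msg], after: Sequence[Msg]) -> int:
    """Count original messages that no longer appear in `after`, by id."""
    surviving_ids = {msg.id for msg in after}
    return sum(1 for msg in before if msg.id not in surviving_ids)


before = [
    Msg("1", "human", "first question"),
    Msg("2", "ai", "first answer"),
    Msg("3", "human", "second question"),
    Msg("4", "ai", "second answer"),
]
after = [
    Msg("s", "human", "A reworded summary template: ..."),
    Msg("4", "ai", "second answer"),
]
summary = find_injected_summary(before, after)
replaced = count_replaced(before, after)
```

In this example the length-based arithmetic would report `4 - 2 = 2`, while the id-based count reports 3, because the injected summary occupies one slot in the new list. Neither helper looks at the summary wording, so a change to the upstream template does not affect detection.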
8. Out of scope / deferred
- Other middleware types (Title, Guardrail, HITL) do not emit custom events either. If we want markers for those too, repeat the wrapper pattern for each. Not in this design.
- Retroactive markers for old threads (captured before this patch) are impossible without re-running the graph. Legacy threads will show the event-store-recovered messages without a marker.
- Standard mode (`make dev`): the agent runs inside LangGraph Server, not the Gateway-embedded runtime. `RunJournal` may not be wired there, so the custom event fires but is captured by no one. Tracked as a separate follow-up.
9. Next actions
- Land the current summarize-message-loss fixes (journal `Command` unwrap + event-store-backed `/history` + inline feedback): implementation verified, being committed now as three commits on `rayhpeng/fix-persistence-new`
- Summarize-marker implementation (this spec) → separate follow-up PR based on the above verified design