mirror of
https://github.com/bytedance/deer-flow.git
synced 2026-05-23 16:35:59 +00:00
e7967a7fc37547f47d305b5057ec24aae6ef1591
31 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
e7967a7fc3 | fix(frontend): hide copy for streaming assistant turn (#3176) | ||
|
|
e93f658472 |
fix(stability): resolve P0 blockers from v2.0-m1-rc1 stability audit (#3107) (#3131)
* fix(task-tool): unwrap callback manager when locating usage recorder
`config["callbacks"]` may arrive as a `BaseCallbackManager` (e.g. the
`AsyncCallbackManager` LangChain hands to async tool runs), not just a plain
list. The previous `for cb in callbacks` loop raised
`TypeError: 'AsyncCallbackManager' object is not iterable`, which
`ToolErrorHandlingMiddleware` then converted into a failed `task` ToolMessage
even though the subagent had completed internally — Ultra mode lost subagent
results and the lead agent fell back to redoing the work.
Unwrap `BaseCallbackManager.handlers` before searching for the recorder.
Refs: bytedance/deer-flow#3107 (BUG-002)
* fix(frontend): treat any task tool error as a terminal subtask failure
The subtask card status machine matched only three English prefixes (`Task
Succeeded. Result:`, `Task failed.`, `Task timed out`). Anything else fell
through to `in_progress`, so a `task` tool error wrapped by
`ToolErrorHandlingMiddleware` (`Error: Tool 'task' failed ...`) left the card
spinning forever even after the run had ended.
Extract the prefix logic into `parseSubtaskResult` and recognise any leading
`Error:` token as a terminal failure. The extracted function is unit-tested
against the legacy prefixes plus the `AsyncCallbackManager` regression
captured in the upstream issue.
Refs: bytedance/deer-flow#3107 (BUG-007)
* fix(frontend): exclude hidden, reasoning, and tool payloads from chat export
`formatThreadAsMarkdown` / `formatThreadAsJSON` iterated raw messages without
running the UI-level `isHiddenFromUIMessage` filter. Exported transcripts
therefore included `hide_from_ui` system reminders, memory injections,
provider `reasoning_content`, tool calls, and tool result messages — content
that is intentionally hidden in the chat view.
Filter the export to the user-visible transcript by default and gate
reasoning / tool calls / tool messages / hidden messages behind explicit
`ExportOptions` flags so a future debug export can opt back in without
forking the formatter.
Refs: bytedance/deer-flow#3107 (BUG-006)
* fix(gateway): route get_config through get_app_config for mtime hot reload
`get_config(request)` returned the `app.state.config` snapshot captured at
startup. The worker / lead-agent path then threaded that frozen `AppConfig`
through `RunContext` and `agent_factory`, so per-run fields edited in
`config.yaml` (notably `max_tokens`) were ignored until the gateway process
was restarted — even though `get_app_config()` already does mtime-based
reload at the bottom layer.
Route the request dependency through `get_app_config()` directly. Runtime
`ContextVar` overrides (`push_current_app_config`) and test-injected
singletons (`set_app_config`) keep working; `app.state.config` is now only
read at startup for one-shot bootstrap (logging level, IM channels,
`langgraph_runtime` engines).
`tests/test_gateway_deps_config.py` encoded the old snapshot contract and is
removed; `tests/test_gateway_config_freshness.py` replaces it with mtime,
ContextVar, and `set_app_config` coverage. `test_skills_custom_router.py` and
`test_uploads_router.py` now inject test configs via FastAPI
`dependency_overrides[get_config]` instead of mutating `app.state.config`.
Document the hot-reload boundary in `backend/CLAUDE.md` so reviewers know
which fields are picked up on the next request vs. which still require a
restart (`database`, `checkpointer`, `run_events`, `stream_bridge`,
`sandbox.use`, `log_level`, `channels.*`).
Refs: bytedance/deer-flow#3107 (BUG-001)
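The mtime-based reload that this item relies on can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's `get_app_config()`: the function here takes an explicit path, the flat `key: value` parser is a stand-in for a real YAML loader, and the module-level cache is hypothetical.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict, Optional, Tuple

# Hypothetical cache: (path, mtime in ns, parsed config).
_cached: Optional[Tuple[Path, int, Dict[str, str]]] = None


def _parse(path: Path) -> Dict[str, str]:
    """Parse flat `key: value` lines; a stand-in for a real YAML loader."""
    result: Dict[str, str] = {}
    for line in path.read_text().splitlines():
        if ":" in line and not line.lstrip().startswith("#"):
            key, _, value = line.partition(":")
            result[key.strip()] = value.strip()
    return result


def get_app_config(path: Path) -> Dict[str, str]:
    """Return the cached config, re-reading the file only when its mtime changes."""
    global _cached
    mtime = os.stat(path).st_mtime_ns
    if _cached is None or _cached[0] != path or _cached[1] != mtime:
        _cached = (path, mtime, _parse(path))
    return _cached[2]


if __name__ == "__main__":
    cfg_path = Path(tempfile.mkdtemp()) / "config.yaml"
    cfg_path.write_text("max_tokens: 1024\n")
    print(get_app_config(cfg_path))
```

Because every request resolves the config through the function rather than holding a startup snapshot, an edit to the file is picked up on the next call without a restart.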
* fix(gateway): broaden get_config 503 to any config-load failure
Address review feedback on the previous commit:
1. Narrow exception catch removed. The old contract returned 503 whenever
`app.state.config is None`. The first cut only mapped
`FileNotFoundError`, leaving `PermissionError`, YAML parse errors, and
pydantic `ValidationError` to bubble up as 500. At the request boundary
we treat any inability to materialise the config as "configuration not
available" (503) and log the original exception so the operator still
has the stack.
2. Removed the unused `request: Request` parameter and the matching
`# noqa: ARG001`. FastAPI's `Depends()` does not require the dependency
to accept `Request`; the only call site uses the no-arg form.
3. `backend/CLAUDE.md` boundary now lists the *reason* each field is
restart-required (engine binding, singleton caching, one-shot
`apply_logging_level`, etc.), not just the field name, so reviewers do
not have to reverse-engineer the boundary themselves.
Tests parametrise four exception classes (`FileNotFoundError`,
`PermissionError`, `ValueError`, `RuntimeError`) and assert 503 for each.
Refs: bytedance/deer-flow#3107 (BUG-001)
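The "any load failure becomes 503" rule from point 1 can be sketched like this. It is a hedged, stdlib-only illustration: `HTTPError` is a local stand-in for a framework exception such as FastAPI's `HTTPException`, and the `load` parameter is hypothetical, added only so the sketch can be exercised without a real config file (the commit describes the actual dependency as taking no arguments).

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("gateway.deps")


class HTTPError(Exception):
    """Stand-in for a web framework's HTTP exception (assumption)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def get_config(load: Callable[[], Any]) -> Any:
    """Resolve the config; map every load failure to a 503 at the request boundary."""
    try:
        return load()
    except Exception as exc:
        # Log the original exception with its stack so the operator can still diagnose it.
        logger.exception("Failed to load application config")
        raise HTTPError(503, "Configuration not available") from exc
```

Chaining with `from exc` keeps the original error reachable on the raised exception, while the client only sees the generic 503.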
* fix(task-tool): defend _find_usage_recorder against non-list callbacks
Address review feedback. The previous commit handled the two common shapes
LangChain hands to async tool runs — a plain `list[BaseCallbackHandler]` and
a `BaseCallbackManager` subclass — but iterated any other shape directly,
which would still raise `TypeError` if e.g. a single handler instance leaked
through without a list wrapper.
Treat any non-list, non-manager `config["callbacks"]` value as "no recorder"
rather than crash. Docstring now lists all four shapes explicitly. New tests
cover the single-handler-object case, `runtime is None`, `callbacks is None`,
and `runtime.config` being a non-dict — all required to be silent no-ops.
Refs: bytedance/deer-flow#3107 (BUG-002)
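The callback handling described in the two task-tool items can be sketched roughly as follows. This is a self-contained illustration, not the project's code: `BaseCallbackHandler`, `BaseCallbackManager`, and `UsageRecorder` are local stand-ins for the LangChain and DeerFlow classes, and `find_usage_recorder` is a hypothetical name modelled on `_find_usage_recorder`.

```python
from typing import Any, Optional


class BaseCallbackHandler:
    """Stand-in for LangChain's handler base class (assumption)."""


class BaseCallbackManager:
    """Stand-in for LangChain's manager, which keeps its handlers in `.handlers`."""

    def __init__(self, handlers: list) -> None:
        self.handlers = list(handlers)


class UsageRecorder(BaseCallbackHandler):
    """Hypothetical recorder type that the task tool looks for."""


def find_usage_recorder(config: Any) -> Optional[UsageRecorder]:
    """Return the first UsageRecorder in config["callbacks"], or None.

    Handles four shapes: a plain list, a callback manager, None,
    and any other object (treated as "no recorder").
    """
    if not isinstance(config, dict):
        return None
    callbacks = config.get("callbacks")
    if isinstance(callbacks, BaseCallbackManager):
        callbacks = callbacks.handlers  # unwrap the manager before searching
    if not isinstance(callbacks, list):
        return None  # None, a bare handler, or anything unexpected
    for cb in callbacks:
        if isinstance(cb, UsageRecorder):
            return cb
    return None
```

Returning `None` for unexpected shapes makes a missing recorder a silent no-op instead of a `TypeError` that would be reported as a failed `task` tool call.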
* fix(frontend): drop dead identity ternary and add opt-in export tests
Address review feedback on the previous export commit:
1. Removed the no-op `typeof msg.content === "string" ? msg.content : msg.content`
expression in `formatThreadAsJSON`. Both branches returned the same value;
the message content now flows through unchanged whether it is a string or
the rich `MessageContent[]` shape (LangChain JSON-serialises the array
structure correctly already).
2. Expanded the JSDoc on `ExportOptions` to make it clearer that the four
flags are not currently wired to any UI control — callers wanting a debug
export must build the options object explicitly. The default behaviour
continues to match the explicit prescription in
bytedance/deer-flow#3107 BUG-006.
3. Added opt-in coverage. The previous tests only exercised the
`options = {}` default path; the new cases verify each flag flips the
corresponding payload back into the export so a future debug-export
surface does not silently break the contract.
Refs: bytedance/deer-flow#3107 (BUG-006)
* fix(frontend): export subtask prefix constants and document fallback intent
Address review feedback on the previous BUG-007 commit:
1. `SUCCESS_PREFIX`, `FAILURE_PREFIX`, `TIMEOUT_PREFIX`, and the
`ERROR_WRAPPER_PATTERN` regex are now exported. The JSDoc explicitly
pins them as part of the backend↔frontend contract defined in
`task_tool.py` and `tool_error_handling_middleware.py`, so any future
structured-status migration (e.g. backend writing
`additional_kwargs.subagent_status` instead of leading text) can
reference these from one canonical place rather than redefine them.
2. The `in_progress` fallback now carries a docstring explaining the
deliberate choice — LangChain only ever emits a `ToolMessage` once the
tool itself has returned, so unrecognised content means the contract
has drifted and "still running" is the right operator signal (eagerly
marking it terminal-failed would mask the drift).
No behaviour change; this is documentation and an API export.
Refs: bytedance/deer-flow#3107 (BUG-007)
* fix(gateway): drop app.state.config snapshot and freeze run_events_config
Address @ShenAC-SAC's BUG-001 review on #3131. The previous cut still
stored an ``AppConfig`` snapshot on ``app.state.config`` for startup
bootstrap. Two follow-on hazards from that:
1. Future code touching the gateway lifespan could accidentally start
reading ``app.state.config`` again, silently regressing the request
hot path back to a stale snapshot.
2. ``get_run_context()`` paired a freshly-reloaded ``AppConfig`` with the
startup-bound ``event_store`` and a *live* ``run_events_config``
field — so an operator who edited ``run_events.backend`` mid-flight
would have produced a run context whose ``event_store`` and
``run_events_config`` referred to different backends.
Clean approach (aligned with the direction in PR #3128):
- ``lifespan()`` keeps a local ``startup_config`` variable and passes it
explicitly into ``langgraph_runtime(app, startup_config)`` and into
``start_channel_service``. No ``app.state.config`` attribute is set at
any point.
- ``langgraph_runtime`` now accepts ``startup_config`` as a required
parameter, removing the ``getattr(app.state, "config", None)`` lookup
and the "config not initialised" runtime error.
- The matching ``run_events_config`` is frozen onto ``app.state`` next
to ``run_event_store`` so ``get_run_context`` reads the two from the
same startup-time source. ``app_config`` continues to be resolved
live via ``get_app_config()``.
- ``backend/CLAUDE.md`` boundary explanation updated to spell out the
``startup_config`` / ``get_app_config()`` split.
New regression test ``test_run_context_app_config_reflects_yaml_edit``
exercises the worker-feeding path: it asserts that ``ctx.app_config``
follows a mid-flight ``config.yaml`` edit while
``ctx.run_events_config`` stays frozen to the startup snapshot the
event store was built from.
Refs: bytedance/deer-flow#3107 (BUG-001), bytedance/deer-flow#3131 review
* fix(frontend): parse Task cancelled and polling timed out as terminal
Address @ShenAC-SAC's BUG-007 review on #3131. `task_tool.py` actually
emits five terminal strings:
- `Task Succeeded. Result: …`
- `Task failed. …`
- `Task timed out. …`
- `Task cancelled by user.` ← previously matched none
- `Task polling timed out after N minutes …` ← previously matched none
The previous cut handled three; the last two fell through to the
"unknown content" branch and pushed the subtask card back to
`in_progress` even though the backend had already reached a terminal
state. Add explicit matches plus regression tests for both. The
`in_progress` fallback is now reserved for genuinely unrecognised
output (i.e. contract drift), as documented.
Refs: bytedance/deer-flow#3107 (BUG-007), bytedance/deer-flow#3131 review
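The status mapping described for `parseSubtaskResult` can be illustrated with a short Python transliteration. The real function lives in the TypeScript frontend; this sketch only mirrors the prefix logic from the commit text. `CANCELLED_PREFIX`, `POLLING_TIMEOUT_PREFIX`, the function name, and the `"completed"` / `"failed"` status labels are assumptions made for illustration.

```python
import re

# Prefixes quoted in the commit message.
SUCCESS_PREFIX = "Task Succeeded. Result:"
FAILURE_PREFIX = "Task failed."
TIMEOUT_PREFIX = "Task timed out"
# Hypothetical constant names for the two strings added in this item.
CANCELLED_PREFIX = "Task cancelled by user."
POLLING_TIMEOUT_PREFIX = "Task polling timed out"
# Any leading "Error:" token counts as a terminal failure.
ERROR_WRAPPER_PATTERN = re.compile(r"^Error:")

_TERMINAL_FAILURES = (
    FAILURE_PREFIX,
    TIMEOUT_PREFIX,
    CANCELLED_PREFIX,
    POLLING_TIMEOUT_PREFIX,
)


def parse_subtask_status(content: str) -> str:
    """Map a task tool message to a subtask card status."""
    text = content.lstrip()
    if text.startswith(SUCCESS_PREFIX):
        return "completed"
    if text.startswith(_TERMINAL_FAILURES):
        return "failed"
    if ERROR_WRAPPER_PATTERN.match(text):
        return "failed"
    # Unrecognised content signals contract drift; keep the card in progress.
    return "in_progress"
```

For example, `parse_subtask_status("Task cancelled by user.")` yields the failure label, while an unrecognised string stays `in_progress`, matching the fallback intent documented in the earlier item.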
* fix(frontend): sanitize JSON export content via the Markdown content path
Address @ShenAC-SAC's BUG-006 review and the Copilot inline comment on
#3131. The previous cut filtered hidden/tool messages out of the JSON
export but still serialised `msg.content` verbatim, so:
- inline `<think>…</think>` wrappers stayed in the exported `content`
even with `includeReasoning: false`,
- content-array thinking blocks leaked the `thinking` field,
- `<uploaded_files>…</uploaded_files>` markers leaked the workspace
paths a user uploaded files to.
JSON now goes through the same sanitiser the Markdown path uses
(`extractContentFromMessage` + `stripUploadedFilesTag`). Reasoning and
tool_calls remain gated behind their `ExportOptions` flags. AI / human
rows that sanitise to empty content with no opted-in reasoning or tool
calls are dropped so the JSON matches the Markdown path's `continue`
on empty assistant fragments.
New regression tests cover the three leak shapes the reviewer called
out plus the empty-content-drop case.
Refs: bytedance/deer-flow#3107 (BUG-006), bytedance/deer-flow#3131 review
* test(gateway): align lifespan stub with langgraph_runtime two-arg signature
Codex round-3 review of
|
||
|
|
dfa4eb0c1a |
[codex] fix follow-up suggestions layout (#2836)
* fix follow-up suggestions layout
* fix agent chat welcome layout transition
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
d02f762ab0 |
feat: refine token usage display modes (#2329)
* feat: refine token usage display modes
* docs: clarify token usage accounting semantics
* fix: avoid duplicate subtask debug keys
* style: format token usage tests
* chore: address token attribution review feedback
* Update test_token_usage_middleware.py
* Update test_token_usage_middleware.py
* chore: simplify token attribution fallback
* fix token usage metadata follow-up handling
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
db5ad86381 |
feat: enhance chat history loading with new hooks and UI components (#2338)
* Refactor API fetch calls to use a unified fetch function; enhance chat history loading with new hooks and UI components
- Replaced `fetchWithAuth` with a generic `fetch` function across various API modules for consistency.
- Updated `useThreadStream` and `useThreadHistory` hooks to manage chat history loading, including loading states and pagination.
- Introduced `LoadMoreHistoryIndicator` component for better user experience when loading more chat history.
- Enhanced message handling in `MessageList` to accommodate new loading states and history management.
- Added support for run messages in the thread context, improving the overall message handling logic.
- Updated translations for loading indicators in English and Chinese.
* Fix test assertions for run ordering in RunManager tests
- Updated assertions in `test_list_by_thread` to reflect correct ordering of runs.
- Modified `test_list_by_thread_is_stable_when_timestamps_tie` to ensure stable ordering when timestamps are tied. |
||
|
|
56d5fa3337 |
feat(persistence):Unified persistence layer with event store, feedback, and rebase cleanup (#2134)
* feat(persistence): add unified persistence layer with event store, token tracking, and feedback (#1930) * feat(persistence): add SQLAlchemy 2.0 async ORM scaffold Introduce a unified database configuration (DatabaseConfig) that controls both the LangGraph checkpointer and the DeerFlow application persistence layer from a single `database:` config section. New modules: - deerflow.config.database_config — Pydantic config with memory/sqlite/postgres backends - deerflow.persistence — async engine lifecycle, DeclarativeBase with to_dict mixin, Alembic skeleton - deerflow.runtime.runs.store — RunStore ABC + MemoryRunStore implementation Gateway integration initializes/tears down the persistence engine in the existing langgraph_runtime() context manager. Legacy checkpointer config is preserved for backward compatibility. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(persistence): add RunEventStore ABC + MemoryRunEventStore Phase 2-A prerequisite for event storage: adds the unified run event stream interface (RunEventStore) with an in-memory implementation, RunEventsConfig, gateway integration, and comprehensive tests (27 cases). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(persistence): add ORM models, repositories, DB/JSONL event stores, RunJournal, and API endpoints Phase 2-B: run persistence + event storage + token tracking. 
- ORM models: RunRow (with token fields), ThreadMetaRow, RunEventRow - RunRepository implements RunStore ABC via SQLAlchemy ORM - ThreadMetaRepository with owner access control - DbRunEventStore with trace content truncation and cursor pagination - JsonlRunEventStore with per-run files and seq recovery from disk - RunJournal (BaseCallbackHandler) captures LLM/tool/lifecycle events, accumulates token usage by caller type, buffers and flushes to store - RunManager now accepts optional RunStore for persistent backing - Worker creates RunJournal, writes human_message, injects callbacks - Gateway deps use factory functions (RunRepository when DB available) - New endpoints: messages, run messages, run events, token-usage - ThreadCreateRequest gains assistant_id field - 92 tests pass (33 new), zero regressions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(persistence): add user feedback + follow-up run association Phase 2-C: feedback and follow-up tracking. - FeedbackRow ORM model (rating +1/-1, optional message_id, comment) - FeedbackRepository with CRUD, list_by_run/thread, aggregate stats - Feedback API endpoints: create, list, stats, delete - follow_up_to_run_id in RunCreateRequest (explicit or auto-detected from latest successful run on the thread) - Worker writes follow_up_to_run_id into human_message event metadata - Gateway deps: feedback_repo factory + getter - 17 new tests (14 FeedbackRepository + 3 follow-up association) - 109 total tests pass, zero regressions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test+config: comprehensive Phase 2 test coverage + deprecate checkpointer config - config.example.yaml: deprecate standalone checkpointer section, activate unified database:sqlite as default (drives both checkpointer + app data) - New: test_thread_meta_repo.py (14 tests) — full ThreadMetaRepository coverage including check_access owner logic, list_by_owner pagination - Extended test_run_repository.py (+4 
tests) — completion preserves fields, list ordering desc, limit, owner_none returns all - Extended test_run_journal.py (+8 tests) — on_chain_error, track_tokens=false, middleware no ai_message, unknown caller tokens, convenience fields, tool_error, non-summarization custom event - Extended test_run_event_store.py (+7 tests) — DB batch seq continuity, make_run_event_store factory (memory/db/jsonl/fallback/unknown) - Extended test_phase2b_integration.py (+4 tests) — create_or_reject persists, follow-up metadata, summarization in history, full DB-backed lifecycle - Fixed DB integration test to use proper fake objects (not MagicMock) for JSON-serializable metadata - 157 total Phase 2 tests pass, zero regressions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * config: move default sqlite_dir to .deer-flow/data Keep SQLite databases alongside other DeerFlow-managed data (threads, memory) under the .deer-flow/ directory instead of a top-level ./data folder. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(persistence): remove UTFJSON, use engine-level json_serializer + datetime.now() - Replace custom UTFJSON type with standard sqlalchemy.JSON in all ORM models. Add json_serializer=json.dumps(ensure_ascii=False) to all create_async_engine calls so non-ASCII text (Chinese etc.) is stored as-is in both SQLite and Postgres. - Change ORM datetime defaults from datetime.now(UTC) to datetime.now(), remove UTC imports. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(gateway): simplify deps.py with getter factory + inline repos - Replace 6 identical getter functions with _require() factory. - Inline 3 _make_*_repo() factories into langgraph_runtime(), call get_session_factory() once instead of 3 times. - Add thread_meta upsert in start_run (services.py). 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(docker): add UV_EXTRAS build arg for optional dependencies Support installing optional dependency groups (e.g. postgres) at Docker build time via UV_EXTRAS build arg: UV_EXTRAS=postgres docker compose build Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(journal): fix flush, token tracking, and consolidate tests RunJournal fixes: - _flush_sync: retain events in buffer when no event loop instead of dropping them; worker's finally block flushes via async flush(). - on_llm_end: add tool_calls filter and caller=="lead_agent" guard for ai_message events; mark message IDs for dedup with record_llm_usage. - worker.py: persist completion data (tokens, message count) to RunStore in finally block. Model factory: - Auto-inject stream_usage=True for BaseChatOpenAI subclasses with custom api_base, so usage_metadata is populated in streaming responses. Test consolidation: - Delete test_phase2b_integration.py (redundant with existing tests). - Move DB-backed lifecycle test into test_run_journal.py. - Add tests for stream_usage injection in test_model_factory.py. - Clean up executor/task_tool dead journal references. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): widen content type to str|dict in all store backends Allow event content to be a dict (for structured OpenAI-format messages) in addition to plain strings. Dict values are JSON-serialized for the DB backend and deserialized on read; memory and JSONL backends handle dicts natively. Trace truncation now serializes dicts to JSON before measuring. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(events): use metadata flag instead of heuristic for dict content detection Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(converters): add LangChain-to-OpenAI message format converters Pure functions langchain_to_openai_message, langchain_to_openai_completion, langchain_messages_to_openai, and _infer_finish_reason for converting LangChain BaseMessage objects to OpenAI Chat Completions format, used by RunJournal for event storage. 15 unit tests added. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(converters): handle empty list content as null, clean up test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): human_message content uses OpenAI user message format Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(events): ai_message uses OpenAI format, add ai_tool_call message event - ai_message content now uses {"role": "assistant", "content": "..."} format - New ai_tool_call message event emitted when lead_agent LLM responds with tool_calls - ai_tool_call uses langchain_to_openai_message converter for consistent format - Both events include finish_reason in metadata ("stop" or "tool_calls") Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): add tool_result message event with OpenAI tool message format Cache tool_call_id from on_tool_start keyed by run_id as fallback for on_tool_end, then emit a tool_result message event (role=tool, tool_call_id, content) after each successful tool completion. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(events): summary content uses OpenAI system message format Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): replace llm_start/llm_end with llm_request/llm_response in OpenAI format Add on_chat_model_start to capture structured prompt messages as llm_request events. 
Replace llm_end trace events with llm_response using OpenAI Chat Completions format. Track llm_call_index to pair request/response events. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(events): add record_middleware method for middleware trace events Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test(events): add full run sequence integration test for OpenAI content format Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(events): align message events with checkpoint format and add middleware tag injection - Message events (ai_message, ai_tool_call, tool_result, human_message) now use BaseMessage.model_dump() format, matching LangGraph checkpoint values.messages - on_tool_end extracts tool_call_id/name/status from ToolMessage objects - on_tool_error now emits tool_result message events with error status - record_middleware uses middleware:{tag} event_type and middleware category - Summarization custom events use middleware:summarize category - TitleMiddleware injects middleware:title tag via get_config() inheritance - SummarizationMiddleware model bound with middleware:summarize tag - Worker writes human_message using HumanMessage.model_dump() Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(threads): switch search endpoint to threads_meta table and sync title - POST /api/threads/search now queries threads_meta table directly, removing the two-phase Store + Checkpointer scan approach - Add ThreadMetaRepository.search() with metadata/status filters - Add ThreadMetaRepository.update_display_name() for title sync - Worker syncs checkpoint title to threads_meta.display_name on run completion - Map display_name to values.title in search response for API compatibility Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(threads): history endpoint reads messages from event store - POST /api/threads/{thread_id}/history now combines two data sources: 
checkpointer for checkpoint_id, metadata, title, thread_data; event store for messages (complete history, not truncated by summarization) - Strip internal LangGraph metadata keys from response - Remove full channel_values serialization in favor of selective fields Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove duplicate optional-dependencies header in pyproject.toml Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(middleware): pass tagged config to TitleMiddleware ainvoke call Without the config, the middleware:title tag was not injected, causing the LLM response to be recorded as a lead_agent ai_message in run_events. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve merge conflict in .env.example Keep both DATABASE_URL (from persistence-scaffold) and WECOM credentials (from main) after the merge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(persistence): address review feedback on PR #1851 - Fix naive datetime.now() → datetime.now(UTC) in all ORM models - Fix seq race condition in DbRunEventStore.put() with FOR UPDATE and UNIQUE(thread_id, seq) constraint - Encapsulate _store access in RunManager.update_run_completion() - Deduplicate _store.put() logic in RunManager via _persist_to_store() - Add update_run_completion to RunStore ABC + MemoryRunStore - Wire follow_up_to_run_id through the full create path - Add error recovery to RunJournal._flush_sync() lost-event scenario - Add migration note for search_threads breaking change - Fix test_checkpointer_none_fix mock to set database=None Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: update uv.lock Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(persistence): address 22 review comments from CodeQL, Copilot, and Code Quality Bug fixes: - Sanitize log params to prevent log injection (CodeQL) - Reset threads_meta.status to idle/error when 
run completes - Attach messages only to latest checkpoint in /history response - Write threads_meta on POST /threads so new threads appear in search Lint fixes: - Remove unused imports (journal.py, migrations/env.py, test_converters.py) - Convert lambda to named function (engine.py, Ruff E731) - Remove unused logger definitions in repos (Ruff F841) - Add logging to JSONL decode errors and empty except blocks - Separate assert side-effects in tests (CodeQL) - Remove unused local variables in tests (Ruff F841) - Fix max_trace_content truncation to use byte length, not char length Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: apply ruff format to persistence and runtime files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Potential fix for pull request finding 'Statement has no effect' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * refactor(runtime): introduce RunContext to reduce run_agent parameter bloat Extract checkpointer, store, event_store, run_events_config, thread_meta_repo, and follow_up_to_run_id into a frozen RunContext dataclass. Add get_run_context() in deps.py to build the base context from app.state singletons. start_run() uses dataclasses.replace() to enrich per-run fields before passing ctx to run_agent. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(gateway): move sanitize_log_param to app/gateway/utils.py Extract the log-injection sanitizer from routers/threads.py into a shared utils module and rename to sanitize_log_param (public API). Eliminates the reverse service → router import in services.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * perf: use SQL aggregation for feedback stats and thread token usage Replace Python-side counting in FeedbackRepository.aggregate_by_run with a single SELECT COUNT/SUM query. 
Add RunStore.aggregate_tokens_by_thread abstract method with SQL GROUP BY implementation in RunRepository and Python fallback in MemoryRunStore. Simplify the thread_token_usage endpoint to delegate to the new method, eliminating the limit=10000 truncation risk.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: annotate DbRunEventStore.put() as low-frequency path
Add docstring clarifying that put() opens a per-call transaction with FOR UPDATE and should only be used for infrequent writes (currently just the initial human_message event). High-throughput callers should use put_batch() instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(threads): fall back to Store search when ThreadMetaRepository is unavailable
When database.backend=memory (default) or no SQL session factory is configured, search_threads now queries the LangGraph Store instead of returning 503. Returns empty list if neither Store nor repo is available.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(persistence): introduce ThreadMetaStore ABC for backend-agnostic thread metadata
Add ThreadMetaStore abstract base class with create/get/search/update/delete interface. ThreadMetaRepository (SQL) now inherits from it. New MemoryThreadMetaStore wraps LangGraph BaseStore for memory-mode deployments. deps.py now always provides a non-None thread_meta_repo, eliminating all `if thread_meta_repo is not None` guards in services.py, worker.py, and routers/threads.py. search_threads no longer needs a Store fallback branch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(history): read messages from checkpointer instead of RunEventStore
The /history endpoint now reads messages directly from the checkpointer's channel_values (the authoritative source) instead of querying RunEventStore.list_messages(). The RunEventStore API is preserved for other consumers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(persistence): address new Copilot review comments
- feedback.py: validate thread_id/run_id before deleting feedback
- jsonl.py: add path traversal protection with ID validation
- run_repo.py: parse `before` to datetime for PostgreSQL compat
- thread_meta_repo.py: fix pagination when metadata filter is active
- database_config.py: use resolve_path for sqlite_dir consistency
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Implement skill self-evolution and skill_manage flow (#1874)
* chore: ignore .worktrees directory
* Add skill_manage self-evolution flow
* Fix CI regressions for skill_manage
* Address PR review feedback for skill evolution
* fix(skill-evolution): preserve history on delete
* fix(skill-evolution): tighten scanner fallbacks
* docs: add skill_manage e2e evidence screenshot
* fix(skill-manage): avoid blocking fs ops in session runtime
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* fix(config): resolve sqlite_dir relative to CWD, not Paths.base_dir
resolve_path() resolves relative to Paths.base_dir (.deer-flow), which double-nested the path to .deer-flow/.deer-flow/data/app.db. Use Path.resolve() (CWD-relative) instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Feature/feishu receive file (#1608)
* feat(feishu): add channel file materialization hook for inbound messages
- Introduce Channel.receive_file(msg, thread_id) as a base method for file materialization; default is no-op.
- Implement FeishuChannel.receive_file to download files/images from Feishu messages, save to sandbox, and inject virtual paths into msg.text.
- Update ChannelManager to call receive_file for any channel if msg.files is present, enabling downstream model access to user-uploaded files.
- No impact on Slack/Telegram or other channels (they inherit the default no-op).
* style(backend): format code with ruff for lint compliance
- Auto-formatted packages/harness/deerflow/agents/factory.py and tests/test_create_deerflow_agent.py using `ruff format`
- Ensured both files conform to project linting standards
- Fixes CI lint check failures caused by code style issues
* fix(feishu): handle file write operation asynchronously to prevent blocking
* fix(feishu): rename GetMessageResourceRequest to _GetMessageResourceRequest and remove redundant code
* test(feishu): add tests for receive_file method and placeholder replacement
* fix(manager): remove unnecessary type casting for channel retrieval
* fix(feishu): update logging messages to reflect resource handling instead of image
* fix(feishu): sanitize filename by replacing invalid characters in file uploads
* fix(feishu): improve filename sanitization and reorder image key handling in message processing
* fix(feishu): add thread lock to prevent filename conflicts during file downloads
* fix(test): correct bad merge in test_feishu_parser.py
* chore: run ruff and apply formatting cleanup
fix(feishu): preserve rich-text attachment order and improve fallback filename handling
* fix(docker): restore gateway env vars and fix langgraph empty arg issue (#1915)
Two production docker-compose.yaml bugs prevent `make up` from working:
1. Gateway missing DEER_FLOW_CONFIG_PATH and DEER_FLOW_EXTENSIONS_CONFIG_PATH environment overrides. Added in |
||
|
|
105db00987 |
feat: show token usage per assistant response (#2270)
* feat: show token usage per assistant response
* fix: align client models response with token usage
* fix: address token usage review feedback
* docs: clarify token usage config example
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
85b7ed3cec |
fix(frontend): avoid using route new as thread id (#1967)
Co-authored-by: luoxiao6645 <luoxiao6645@gmail.com> |
||
|
|
d1baf7212b |
fix(frontend): UI polish - fix CSS typo, dark mode border, and hardcoded colors (#1942)
- Fix `font-norma` typo to `font-normal` in message-list subtask count
- Fix dark mode `--border` using reddish hue (22.216) instead of neutral
- Replace hardcoded `rgb(184,184,192)` in hero with `text-muted-foreground`
- Replace hardcoded `bg-[#a3a1a1]` in streaming indicator with `bg-muted-foreground`
- Add missing `font-sans` to welcome description `<pre>` for consistency
- Make case-study-section padding responsive (`px-4 md:px-20`)
Closes #1940 |
||
|
|
9735d73b83 |
fix(ui): avoid follow-up suggestion overlap (#1777)
* fix(ui): avoid follow-up suggestion overlap
* fix(ui): address followup review feedback
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
0431a67b68 |
fix(frontend): filter task tool calls when rendering SubtaskCard (#1242)
Only tool calls with name === "task" should be rendered as SubtaskCard. Previously all tool_calls were mapped to IDs, causing SubtaskCard to render for non-task tool calls whose IDs were never registered in the subtask context, resulting in a TypeError on task.status.
Signed-off-by: Gao Mingfei <g199209@gmail.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
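The filtering rule described in the commit above can be sketched as follows. This is a minimal illustration, not the repository's code: the `ToolCall` shape and the helper name `subtaskIdsFromToolCalls` are assumptions made for the example; only the `name === "task"` condition comes from the commit message.

```typescript
// Hypothetical shape of a tool call; the real type in the codebase may differ.
interface ToolCall {
  id?: string;
  name: string;
  args?: Record<string, unknown>;
}

// Collect IDs only for tool calls named "task", so a SubtaskCard is never
// rendered for an ID that was not registered in the subtask context.
function subtaskIdsFromToolCalls(toolCalls: ToolCall[]): string[] {
  const ids: string[] = [];
  for (const call of toolCalls) {
    // Skip non-task tools and calls that have no ID to look up.
    if (call.name === "task" && call.id !== undefined) {
      ids.push(call.id);
    }
  }
  return ids;
}
```

Filtering before mapping to IDs means the renderer never looks up a subtask that does not exist, which is the failure mode the commit message reports.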
|
|
cf9af1fe75 | Enhance chat UI and compatible anthropic thinking messages (#1018) | ||
|
|
7de94394d4 |
feat(agent):Supports custom agent and chat experience with refactoring (#957)
* feat: add agent management functionality with creation, editing, and deletion
* feat: enhance agent creation and chat experience
- Added AgentWelcome component to display agent description on new thread creation.
- Improved agent name validation with availability check during agent creation.
- Updated NewAgentPage to handle agent creation flow more effectively, including enhanced error handling and user feedback.
- Refactored chat components to streamline message handling and improve user experience.
- Introduced new bootstrap skill for personalized onboarding conversations, including detailed conversation phases and a structured SOUL.md template.
- Updated localization files to reflect new features and error messages.
- General code cleanup and optimizations across various components and hooks.
* Refactor workspace layout and agent management components
- Updated WorkspaceLayout to use useLayoutEffect for sidebar state initialization.
- Removed unused AgentFormDialog and related edit functionality from AgentCard.
- Introduced ArtifactTrigger component to manage artifact visibility.
- Enhanced ChatBox to handle artifact selection and display.
- Improved message list rendering logic to avoid loading states.
- Updated localization files to remove deprecated keys and add new translations.
- Refined hooks for local settings and thread management to improve performance and clarity.
- Added temporal awareness guidelines to deep research skill documentation.
* feat: refactor chat components and introduce thread management hooks
* feat: improve artifact file detail preview logic and clean up console logs
* feat: refactor lead agent creation logic and improve logging details
* feat: validate agent name format and enhance error handling in agent setup
* feat: simplify thread search query by removing unnecessary metadata
* feat: update query key in useDeleteThread and useRenameThread for consistency
* feat: add isMock parameter to thread and artifact handling for improved testing
* fix: reorder import of setup_agent for consistency in builtins module
* feat: append mock parameter to thread links in CaseStudySection for testing purposes
* fix: update load_agent_soul calls to use cfg.name for improved clarity
* fix: update date format in apply_prompt_template for consistency
* feat: integrate isMock parameter into artifact content loading for enhanced testing
* docs: add license section to SKILL.md for clarity and attribution
* feat(agent): enhance model resolution and agent configuration handling
* chore: remove unused import of _resolve_model_name from agents
* feat(agent): remove unused field
* fix(agent): set default value for requested_model_name in _resolve_model_name function
* feat(agent): update get_available_tools call to handle optional agent_config and improve middleware function signature
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
a138d5388a |
feat: add reasoning_effort configuration support for Doubao/GPT-5 models (#947)
* feat: Add reasoning effort configuration support
* Add `reasoning_effort` parameter to model config and agent initialization
* Support reasoning effort levels (minimal/low/medium/high) for Doubao/GPT-5 models
* Add UI controls in input box for reasoning effort selection
* Update doubao-seed-1.8 example config with reasoning effort support
Fixes & Cleanup:
* Ensure UTF-8 encoding for file operations
* Remove unused imports
* fix: set reasoning_effort to None for unsupported models
* fix: unit test error
* Update frontend/src/components/workspace/input-box.tsx
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com> |
||
|
|
2f50e5d969 |
feat(citations): inline citation links with [citation:Title](URL)
- Backend: add citation format to lead_agent and general_purpose prompts
- Add CitationLink component (Badge + HoverCard) for citation cards
- MarkdownContent: detect citation: prefix in link text, render CitationLink
- Message/artifact/subtask: use MarkdownContent or Streamdown with CitationLink
- message-list-item: pass img via components prop (remove isHuman/img)
- message-group, subtask-card: drop unused imports; fix import order (lint)
Co-authored-by: Cursor <cursoragent@cursor.com> |
||
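As a rough illustration of the `[citation:Title](URL)` convention in the commit above, the prefix check on a link's text might look like the sketch below. The helper name `citationTitleFromLinkText` and the decision to treat an empty title as a plain link are assumptions for the example; the commit message only states that MarkdownContent detects the `citation:` prefix in link text.

```typescript
// Prefix that marks a Markdown link as a citation, per the commit message.
const CITATION_PREFIX = "citation:";

// Hypothetical helper: returns the citation title when the link text uses the
// "citation:Title" form, or null when the link is an ordinary Markdown link.
function citationTitleFromLinkText(linkText: string): string | null {
  if (!linkText.startsWith(CITATION_PREFIX)) {
    return null;
  }
  const title = linkText.slice(CITATION_PREFIX.length).trim();
  // Assumption: a bare "citation:" with no title is treated as a plain link.
  return title.length > 0 ? title : null;
}
```

A renderer could then show a citation card when the helper returns a title and fall back to a normal anchor otherwise.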
|
|
46048c76ce |
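chore: remove all citation-related logic in preparation for a later refactor
- Backend: remove citations_format and the citation-related reminders from lead_agent / general_purpose; artifact downloads no longer strip citations from markdown and always go through FileResponse, keeping Response for inline binary content
- Frontend: remove the core/citations module, inline-citation, and safe-citation-content; add MarkdownContent, which only renders Markdown; message/artifact preview and copy all use the raw content
- i18n: remove the citations namespace (loadingCitations, loadingCitationsWithCount)
- Skills and demo: change the wording to "references"; remove the <citations> block from the demo data
- Docs: update the CLAUDE/AGENTS/README descriptions; add a per-file diff summary of the code changes
Co-authored-by: Cursor <cursoragent@cursor.com> |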
||
|
|
d9a86c10e8 |
fix(frontend): no half-finished citations, correct state when SSE ends
Citations:
- In shouldShowCitationLoading, treat any unreplaced [cite-N] in cleanContent
as show-loading (no body). Fixes Ultra and other modes when refs arrive
before the <citations> block in the stream.
- Single rule: hasUnreplacedCitationRefs(cleanContent) => true forces loading;
then isLoading && hasCitationsBlock(rawContent) for streaming indicator.
SSE end state:
- When stream finishes, SDK may set isLoading=false before client state has
the final message content, so UI stayed wrong until refresh.
- Store onFinish(state) as finalState in chat page; clear when stream starts.
- Pass messagesOverride={finalState.messages} to MessageList when not loading
so the list uses server-complete messages right after SSE ends (no refresh).
Chore:
- Stop tracking .githooks/pre-commit; add .githooks/ to .gitignore (local only).
Co-authored-by: Cursor <cursoragent@cursor.com>
|
||
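The loading rule stated in the commit above (any unreplaced `[cite-N]` forces loading; otherwise loading applies only while streaming with a `<citations>` block present) can be sketched as below. The function names are taken from the commit message, but their bodies are assumptions: in particular, the regular expression and the substring check are illustrative guesses, not the repository's implementation.

```typescript
// Assumed pattern for an unreplaced citation reference such as "[cite-3]".
const UNREPLACED_CITE_REF = /\[cite-\d+\]/;

// True when the cleaned content still contains a raw [cite-N] placeholder.
function hasUnreplacedCitationRefs(cleanContent: string): boolean {
  return UNREPLACED_CITE_REF.test(cleanContent);
}

// True when the raw content contains an opening <citations> tag
// (assumed check; it also matches a block that is still streaming in).
function hasCitationsBlock(rawContent: string): boolean {
  return rawContent.includes("<citations>");
}

// Single rule from the commit message:
// 1. unreplaced refs in cleanContent always force the loading state;
// 2. otherwise show loading only while streaming and a citations block exists.
function shouldShowCitationLoading(
  rawContent: string,
  cleanContent: string,
  isLoading: boolean,
): boolean {
  if (hasUnreplacedCitationRefs(cleanContent)) {
    return true;
  }
  return isLoading && hasCitationsBlock(rawContent);
}
```

Checking the cleaned content first covers the case the commit calls out, where references arrive in the stream before the `<citations>` block does.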
|
|
4f9d1d524e |
feat(frontend): unify citation logic and prevent half-finished citations
- Add SafeCitationContent as single component for citation-aware body: useParsedCitations + shouldShowCitationLoading; show loading until citations complete, then render body with createCitationMarkdownComponents. Supports optional remarkPlugins, rehypePlugins, isHuman, img.
- Refactor MessageListItem: assistant message body now uses SafeCitationContent only; remove duplicate useParsedCitations, shouldShowCitationLoading, createCitationMarkdownComponents and CitationsLoadingIndicator logic. Human messages keep plain AIElementMessageResponse (no citation parsing).
- Use SafeCitationContent for clarification, present-files (message-list), thinking steps and write_file loading (message-group), subtask result (subtask-card). Artifact markdown preview keeps same guard (shouldShowCitationLoading) with ArtifactFilePreview.
- Unify loading condition: shouldShowCitationLoading(rawContent, cleanContent, isLoading) is the single source of truth. Show loading when (isLoading && hasCitationsBlock(rawContent)) or when (hasCitationsBlock(rawContent) && hasUnreplacedCitationRefs(cleanContent)) so Pro/Ultra modes also show "loading citations" and half-finished [cite-N] never appear.
- message-group write_file: replace hasCitationsBlock + threadIsLoading with shouldShowCitationLoading(fileContent, cleanContent, threadIsLoading && isLast) for consistency.
- citations/utils: parse incomplete <citations> during streaming; remove isCitationsBlockIncomplete; keep hasUnreplacedCitationRefs internal; document display rule in file header.
Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
f146e35ee7 | feat: rewording | ||
|
|
5d4cecbb84 | refactor: optimize task handling in message list | ||
|
|
d9a52f07e7 | feat: add handling for task timeout and enhance Streamdown plugin for word animation | ||
|
|
3e2883e2a3 | feat: support subtasks | ||
|
|
d4bfed271b |
feat: display ask_clarification tool messages directly in frontend
Simplify clarification message handling by having the frontend detect and display ask_clarification tool messages directly, instead of relying on backend to add an extra AIMessage.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> |
||
|
|
78bba47769 | feat: add Titanic ADA demo | ||
|
|
3f4bcd9433 | feat: implement the first version of landing page | ||
|
|
1e4e51a80c | feat: add Todos | ||
|
|
6bf187c1c2 | fix: fix message grouping issues | ||
|
|
23dc64fab1 | feat: enhance message display | ||
|
|
962d8f04ec | feat: support artifact preview | ||
|
|
ec5bbf6b51 | feat: set artifacts layout | ||
|
|
4613d6e16e | refactor: rename |