feat(im): Add user-owned IM channel connections (#3487)

* Add user-owned IM channel connections * Fix dev startup and channel connect popup * Use async channel connect flow * Harden dev service daemon startup * Support local IM channel connections * Align IM connections with local channels * Fix safe user id digest algorithm * Address Copilot IM channel feedback * Address IM channel review comments * Support all integrated IM channel connections * Format additional channel connection tests * Keep unavailable channel connect buttons clickable * Fix IM channel provider icons * Add runtime setup for enabled IM channels * Guard global shortcut key handling * Keep configured IM channels editable * Avoid password autofill for channel secrets * Make channel threads visible to connection owners * Persist IM runtime config locally * Allow disconnecting runtime IM channels * Route no-auth channel sessions to local user * Use default user for auth-disabled local mode * Show IM channel source on threads * Prefill IM channel runtime config * Reflect IM channel runtime health * Ignore Feishu message read events * Ignore Feishu non-content message events * Let setup wizard enable IM channels * Fix frontend formatting after merge * Stabilize backend tests without local config * Isolate channel runtime config tests * Address channel connection review comments * Use sha256 user buckets with legacy migration * Ensure runtime IM channels are ready after restart * Persist disconnected IM channel state * Address channel connection review comments * Address channel connection review findings Frontend connect flow: - Open the runtime-config dialog only when a provider still needs credentials; configured providers go straight to the connect flow, so the binding-code/deep-link path is reachable from the UI again. - After saving credentials, continue into the connect flow when a user binding is still required (multi-user mode) instead of stopping at a "Connected" toast. - Extract shared provider-state helpers to core/channels/provider-state and add unit + e2e coverage for the direct-connect and configure-then-connect paths. Provider status semantics: - Report connection_status from the user's newest connection row; with no binding it is not_connected, except in auth-disabled local mode where a configured running channel is effectively connected. Concurrency and event-loop correctness: - Offload ChannelRuntimeConfigStore construction and writes, channel service construction, and Slack connection replies to threads; add a tests/blocking_io/ anchor for the runtime-config handlers. - Consume binding codes with a conditional UPDATE so a code can only be used once under concurrent workers; retry upsert_connection as an update when a concurrent insert wins the unique constraint. - Serialize ensure_channel_ready per channel so concurrent provider polls cannot double-start a channel worker. Config and migration hardening: - Stop mutating the get_app_config()-cached Telegram provider config; the runtime store now owns the UI-entered bot username. - Register channel_connections in STARTUP_ONLY_FIELDS with the standardized startup-only Field description. - Match the legacy unsafe-id bucket by recomputing its exact SHA-1 name so another user's same-prefix bucket can never be migrated. - Remove the unused Telegram process_webhook_update path and document src/core/channels in the frontend docs. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * Address PR review comments on authz scoping and channel runtime Security (review feedback from ShenAC-SAC): - Scope internal-token callers to the connection owner carried in X-DeerFlow-Owner-User-Id instead of bypassing owner checks outright, in both require_permission(owner_check=True) and the stateless run endpoints. Internal callers keep access to their own and shared/legacy threads, and may claim a default-owned channel thread for its real owner, but a leaked internal token no longer grants cross-user thread access. - Require admin privileges for POST/DELETE /api/channels/{provider}/ runtime-config: runtime credentials and channel workers are instance-wide shared state (same model as the MCP config API). Read-only provider listing stays available to all users. Performance (review feedback from willem-bd): - Skip the redundant thread channel-metadata PATCH after the first successful backfill per thread. - Reuse the per-connection Slack WebClient until its token changes instead of constructing one per outbound message. - Reconcile channel readiness for all providers concurrently in GET /api/channels/providers. Also resolve the code-quality unused-import flag in the blocking-io anchor by pre-importing the channel service via importlib. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * Fix prettier formatting in provider-state test Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * Reconcile UI runtime channel config with config reload on restart Main now reloads a channel's config.yaml entry on restart_channel() (#3514, issue #3497). Adapt the user-owned connection flow to coexist: - configure_channel() restarts with reload_config=False — the caller just supplied the authoritative config (browser-entered credentials that are never written to config.yaml), so a file reload must not clobber it with the stale on-disk entry. - _load_channel_config() re-applies the UI runtime-store overlay used at startup, so an operator-triggered restart keeps browser-entered credentials for channels without a config.yaml entry and does not resurrect a channel disconnected from the UI. - Offload the reload's disk IO (config.yaml + runtime store) with asyncio.to_thread, matching the blocking-IO policy on this branch. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 10:55:59 +00:00 · 2026-06-12 15:24:58 +08:00
parent b8f5ed360f
commit aa015462a7
96 changed files with 8585 additions and 277 deletions
@@ -274,6 +274,22 @@ def _response_metadata(base_metadata: dict[str, Any], *, pending_clarification:
    return metadata


+def _thread_channel_metadata(msg: InboundMessage) -> dict[str, Any]:
+    channel_source: dict[str, Any] = {
+        "type": "im_channel",
+        "provider": msg.channel_name,
+        "chat_id": msg.chat_id,
+    }
+    if msg.topic_id:
+        channel_source["topic_id"] = msg.topic_id
+    if msg.thread_ts:
+        channel_source["thread_ts"] = msg.thread_ts
+    if msg.connection_id:
+        channel_source["connection_id"] = msg.connection_id
+
+    return {"channel_source": channel_source}
+
+
 def _extract_text_content(content: Any) -> str:
    """Extract text from a streaming payload content field."""
    if isinstance(content, str):
@@ -440,6 +456,43 @@ def _human_input_message(content: str, *, original_content: str | None = None) -
    return message


+def _auth_disabled_owner_user_id() -> str | None:
+    try:
+        from app.gateway.auth_disabled import AUTH_DISABLED_USER_ID, is_auth_disabled
+    except Exception:
+        logger.debug("Unable to inspect auth-disabled mode for channel owner fallback", exc_info=True)
+        return None
+    return AUTH_DISABLED_USER_ID if is_auth_disabled() else None
+
+
+def _effective_owner_user_id(msg: InboundMessage) -> str | None:
+    return _auth_disabled_owner_user_id() or msg.owner_user_id
+
+
+def _apply_effective_owner(msg: InboundMessage) -> InboundMessage:
+    owner_user_id = _effective_owner_user_id(msg)
+    if owner_user_id:
+        msg.owner_user_id = owner_user_id
+    return msg
+
+
+def _owner_headers(msg: InboundMessage) -> dict[str, str] | None:
+    owner_user_id = _effective_owner_user_id(msg)
+    if not owner_user_id:
+        return None
+    return create_internal_auth_headers(owner_user_id=owner_user_id)
+
+
+def _safe_user_id_for_run(raw_user_id: str) -> str:
+    from deerflow.config.paths import get_paths
+
+    try:
+        return get_paths().prepare_user_dir_for_raw_id(raw_user_id)
+    except Exception:
+        logger.exception("Failed to prepare channel run user directory")
+        return make_safe_user_id(raw_user_id)
+
+
 def _resolve_slash_skill_command(
    text: str,
    available_skills: set[str] | None = None,
@@ -670,6 +723,7 @@ class ChannelManager:
        assistant_id: str = DEFAULT_ASSISTANT_ID,
        default_session: dict[str, Any] | None = None,
        channel_sessions: dict[str, Any] | None = None,
+        connection_repo: Any | None = None,
    ) -> None:
        self.bus = bus
        self.store = store
@@ -679,7 +733,9 @@ class ChannelManager:
        self._assistant_id = assistant_id
        self._default_session = _as_dict(default_session)
        self._channel_sessions = dict(channel_sessions or {})
+        self._connection_repo = connection_repo
        self._client = None  # lazy init — langgraph_sdk async client
+        self._channel_metadata_synced: set[str] = set()
        self._skill_storage: SkillStorage | None = None
        self._csrf_token = generate_csrf_token()
        self._semaphore: asyncio.Semaphore | None = None
@@ -728,12 +784,17 @@ class ChannelManager:
        configurable["checkpoint_ns"] = ""
        configurable["thread_id"] = thread_id

-        # ``user_id`` drives user-scoped filesystem buckets that only accept
-        # ``[A-Za-z0-9_-]``, so normalize the channel id and keep the raw value
-        # under ``channel_user_id`` for platform-facing lookups.
+        # ``user_id`` drives DeerFlow-owned memory, files, and thread buckets.
+        # For browser-connected IM channels, prefer the DeerFlow account that
+        # owns the connection. Preserve the raw platform user under
+        # ``channel_user_id`` for platform-facing lookups and audits.
        run_context_identity: dict[str, Any] = {"thread_id": thread_id}
+        owner_user_id = _effective_owner_user_id(msg)
+        if owner_user_id:
+            run_context_identity["user_id"] = _safe_user_id_for_run(owner_user_id)
+        elif msg.user_id:
+            run_context_identity["user_id"] = _safe_user_id_for_run(msg.user_id)
        if msg.user_id:
-            run_context_identity["user_id"] = make_safe_user_id(msg.user_id)
            run_context_identity["channel_user_id"] = msg.user_id

        run_context = _merge_dicts(
@@ -845,6 +906,7 @@ class ChannelManager:
            logger.error("[Manager] unhandled error in message task: %s", exc, exc_info=exc)

    async def _handle_message(self, msg: InboundMessage) -> None:
+        msg = _apply_effective_owner(msg)
        async with self._semaphore:
            try:
                if msg.msg_type == InboundMessageType.COMMAND:
@@ -877,10 +939,27 @@ class ChannelManager:

    # -- chat handling -----------------------------------------------------

-    async def _create_thread(self, client, msg: InboundMessage) -> str:
-        """Create a new thread through Gateway and store the mapping."""
-        thread = await client.threads.create()
-        thread_id = thread["thread_id"]
+    async def _lookup_thread_id(self, msg: InboundMessage) -> str | None:
+        if msg.connection_id and self._connection_repo is not None:
+            return await self._connection_repo.get_thread_id(
+                msg.connection_id,
+                msg.chat_id,
+                msg.topic_id,
+            )
+        return self.store.get_thread_id(msg.channel_name, msg.chat_id, topic_id=msg.topic_id)
+
+    async def _store_thread_id(self, msg: InboundMessage, thread_id: str) -> None:
+        if msg.connection_id and msg.owner_user_id and self._connection_repo is not None:
+            await self._connection_repo.set_thread_id(
+                connection_id=msg.connection_id,
+                owner_user_id=msg.owner_user_id,
+                provider=msg.channel_name,
+                external_conversation_id=msg.chat_id,
+                external_topic_id=msg.topic_id,
+                thread_id=thread_id,
+            )
+            return
+
        self.store.set_thread_id(
            msg.channel_name,
            msg.chat_id,
@@ -888,18 +967,49 @@ class ChannelManager:
            topic_id=msg.topic_id,
            user_id=msg.user_id,
        )
+
+    async def _create_thread(self, client, msg: InboundMessage) -> str:
+        """Create a new thread through Gateway and store the mapping."""
+        metadata = _thread_channel_metadata(msg)
+        owner_headers = _owner_headers(msg)
+        if owner_headers:
+            thread = await client.threads.create(metadata=metadata, headers=owner_headers)
+        else:
+            thread = await client.threads.create(metadata=metadata)
+        thread_id = thread["thread_id"]
+        await self._store_thread_id(msg, thread_id)
        logger.info("[Manager] new thread created through Gateway: thread_id=%s for chat_id=%s topic_id=%s", thread_id, msg.chat_id, msg.topic_id)
        return thread_id

+    async def _update_thread_channel_metadata(self, client, msg: InboundMessage, thread_id: str) -> None:
+        """Best-effort source metadata backfill for existing IM-created threads."""
+        # The metadata (provider/chat/topic) is constant for a thread, so one
+        # successful backfill per manager lifetime is enough — skip the
+        # redundant PATCH on every subsequent inbound message.
+        if thread_id in self._channel_metadata_synced:
+            return
+        update_kwargs: dict[str, Any] = {"metadata": _thread_channel_metadata(msg)}
+        if owner_headers := _owner_headers(msg):
+            update_kwargs["headers"] = owner_headers
+        try:
+            await client.threads.update(thread_id, **update_kwargs)
+        except Exception:
+            logger.debug("[Manager] failed to update channel metadata for thread_id=%s", thread_id, exc_info=True)
+            return
+        if len(self._channel_metadata_synced) > 4096:
+            self._channel_metadata_synced.clear()
+        self._channel_metadata_synced.add(thread_id)
+
    async def _handle_chat(self, msg: InboundMessage, extra_context: dict[str, Any] | None = None) -> None:
        client = self._get_client()

        # Look up existing DeerFlow thread.
        # topic_id may be None (e.g. Telegram private chats) — the store
        # handles this by using the "channel:chat_id" key without a topic suffix.
-        thread_id = self.store.get_thread_id(msg.channel_name, msg.chat_id, topic_id=msg.topic_id)
+        thread_id = await self._lookup_thread_id(msg)
        if thread_id:
            logger.info("[Manager] reusing thread: thread_id=%s for topic_id=%s", thread_id, msg.topic_id)
+            await self._update_thread_channel_metadata(client, msg, thread_id)

        # No existing thread found — create a new one
        if thread_id is None:
@@ -940,14 +1050,19 @@ class ChannelManager:
            return

        logger.info("[Manager] invoking runs.wait(thread_id=%s, text=%r)", thread_id, msg.text[:100])
+        run_kwargs: dict[str, Any] = {
+            "input": {"messages": [human_message]},
+            "config": run_config,
+            "context": run_context,
+            "multitask_strategy": "reject",
+        }
+        if owner_headers := _owner_headers(msg):
+            run_kwargs["headers"] = owner_headers
        try:
            result = await client.runs.wait(
                thread_id,
                assistant_id,
-                input={"messages": [human_message]},
-                config=run_config,
-                context=run_context,
-                multitask_strategy="reject",
+                **run_kwargs,
            )
        except Exception as exc:
            if _is_thread_busy_error(exc):
@@ -984,6 +1099,8 @@ class ChannelManager:
            artifacts=artifacts,
            attachments=attachments,
            thread_ts=msg.thread_ts,
+            connection_id=msg.connection_id,
+            owner_user_id=msg.owner_user_id,
            metadata=_response_metadata(msg.metadata, pending_clarification=pending_clarification),
        )
        logger.info("[Manager] publishing outbound message to bus: channel=%s, chat_id=%s", msg.channel_name, msg.chat_id)
@@ -1008,16 +1125,21 @@ class ChannelManager:
        last_published_text = ""
        last_publish_at = 0.0
        stream_error: BaseException | None = None
+        stream_kwargs: dict[str, Any] = {
+            "input": {"messages": [human_message]},
+            "config": run_config,
+            "context": run_context,
+            "stream_mode": ["messages-tuple", "values"],
+            "multitask_strategy": "reject",
+        }
+        if owner_headers := _owner_headers(msg):
+            stream_kwargs["headers"] = owner_headers

        try:
            async for chunk in client.runs.stream(
                thread_id,
                assistant_id,
-                input={"messages": [human_message]},
-                config=run_config,
-                context=run_context,
-                stream_mode=["messages-tuple", "values"],
-                multitask_strategy="reject",
+                **stream_kwargs,
            ):
                event = getattr(chunk, "event", "")
                data = getattr(chunk, "data", None)
@@ -1047,6 +1169,8 @@ class ChannelManager:
                        text=latest_text,
                        is_final=False,
                        thread_ts=msg.thread_ts,
+                        connection_id=msg.connection_id,
+                        owner_user_id=msg.owner_user_id,
                        metadata=_response_metadata(msg.metadata),
                    )
                )
@@ -1093,6 +1217,8 @@ class ChannelManager:
                    attachments=attachments,
                    is_final=True,
                    thread_ts=msg.thread_ts,
+                    connection_id=msg.connection_id,
+                    owner_user_id=msg.owner_user_id,
                    metadata=_response_metadata(msg.metadata, pending_clarification=pending_clarification),
                )
            )
@@ -1124,18 +1250,10 @@ class ChannelManager:
        if reply is None and command == "new":
            # Create a new thread through Gateway
            client = self._get_client()
-            thread = await client.threads.create()
-            new_thread_id = thread["thread_id"]
-            self.store.set_thread_id(
-                msg.channel_name,
-                msg.chat_id,
-                new_thread_id,
-                topic_id=msg.topic_id,
-                user_id=msg.user_id,
-            )
+            await self._create_thread(client, msg)
            reply = "New conversation started."
        elif reply is None and command == "status":
-            thread_id = self.store.get_thread_id(msg.channel_name, msg.chat_id, topic_id=msg.topic_id)
+            thread_id = await self._lookup_thread_id(msg)
            reply = f"Active thread: {thread_id}" if thread_id else "No active conversation."
        elif reply is None and command == "models":
            reply = await self._fetch_gateway("/api/models", "models")
@@ -1174,9 +1292,11 @@ class ChannelManager:
        outbound = OutboundMessage(
            channel_name=msg.channel_name,
            chat_id=msg.chat_id,
-            thread_id=self.store.get_thread_id(msg.channel_name, msg.chat_id, topic_id=msg.topic_id) or "",
+            thread_id=await self._lookup_thread_id(msg) or "",
            text=reply,
            thread_ts=msg.thread_ts,
+            connection_id=msg.connection_id,
+            owner_user_id=msg.owner_user_id,
            metadata=_slim_metadata(msg.metadata),
        )
        await self.bus.publish_outbound(outbound)
@@ -1212,9 +1332,11 @@ class ChannelManager:
        outbound = OutboundMessage(
            channel_name=msg.channel_name,
            chat_id=msg.chat_id,
-            thread_id=self.store.get_thread_id(msg.channel_name, msg.chat_id, topic_id=msg.topic_id) or "",
+            thread_id=await self._lookup_thread_id(msg) or "",
            text=error_text,
            thread_ts=msg.thread_ts,
+            connection_id=msg.connection_id,
+            owner_user_id=msg.owner_user_id,
            metadata=_slim_metadata(msg.metadata),
        )
        await self.bus.publish_outbound(outbound)