deer-flow/backend/tests/test_additional_channel_connections.py
Nan Gao 68ba4198b8 fix(channels): make channel connect flow deterministic (#3582)
* fix(channels): make channel connect flow deterministic

* make format

* fix(channels): apply connect-code before allowed_users on telegram and wechat

The bind-bootstrap reorder shipped for slack/dingtalk only. Telegram and
WeChat still gate _check_user/allowed_users before connect-code dispatch, so
a newly allowlisted-but-unbound user is silently rejected when binding via the
browser deep-link / connect-code flow — the same deadlock the PR fixes.

- telegram: consume the /start deep-link token before the allowed_users gate.
- wechat: handle the /connect code before the allowed_users gate, and defer
  inbound file extraction + context-token tracking past the gate so blocked
  senders no longer trigger CDN downloads or token bookkeeping.

Adds regression tests for both adapters mirroring the slack/dingtalk coverage.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(channels): enforce single-active-owner invariant at the DB layer

_revoke_other_active_owners did a SELECT-then-UPDATE in app code with no row
lock or constraint covering active rows. Under READ COMMITTED, two concurrent
connect-code consumes for the same (provider, external_account_id, workspace_id)
from different owners could each observe "no other active owner" and both commit
a connected row, leaving find_connection_by_external_identity nondeterministic.

- Add a partial unique index on (provider, external_account_id, workspace_id)
  WHERE status != 'revoked' (portable to SQLite >= 3.8.0 and PostgreSQL) so the
  database guarantees at most one non-revoked row per external identity.
- Reorder upsert_connection to revoke other owners' active rows before the new
  connected row is flushed (so the index is satisfied at commit), wrapped in a
  bounded rollback-and-retry loop. A losing concurrent writer now retries
  against the now-visible state instead of committing a duplicate.
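
A minimal sketch of the invariant described above, using Python's standard `sqlite3` module. The table name `channel_connections`, the index name `uq_active_identity`, the column set, and the `bind` helper are assumptions made for illustration; they are not the repository's actual schema or its `upsert_connection` implementation.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table guarded by a partial unique index (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE channel_connections (
            id INTEGER PRIMARY KEY,
            owner_user_id TEXT NOT NULL,
            provider TEXT NOT NULL,
            external_account_id TEXT NOT NULL,
            workspace_id TEXT NOT NULL,
            status TEXT NOT NULL
        )
        """
    )
    # At most one non-revoked row may exist per external identity.
    conn.execute(
        """
        CREATE UNIQUE INDEX uq_active_identity
        ON channel_connections (provider, external_account_id, workspace_id)
        WHERE status != 'revoked'
        """
    )
    conn.commit()
    return conn


def bind(conn, owner, provider, account, workspace, max_attempts=3) -> None:
    """Revoke other owners' active rows, then insert the new connected row."""
    identity = (provider, account, workspace)
    for attempt in range(max_attempts):
        try:
            with conn:  # commits on success, rolls back on exception
                conn.execute(
                    "UPDATE channel_connections SET status = 'revoked' "
                    "WHERE provider = ? AND external_account_id = ? AND workspace_id = ? "
                    "AND owner_user_id != ? AND status != 'revoked'",
                    (*identity, owner),
                )
                conn.execute(
                    "INSERT INTO channel_connections "
                    "(owner_user_id, provider, external_account_id, workspace_id, status) "
                    "VALUES (?, ?, ?, ?, 'connected')",
                    (owner, *identity),
                )
            return
        except sqlite3.IntegrityError:
            # Another writer took the slot first; retry against the now-visible state.
            if attempt == max_attempts - 1:
                raise
```

Because the revoke runs before the insert inside one transaction, the index is already satisfied when the new row lands. This sketch always inserts, so a real upsert would also need to update the same owner's existing row instead of inserting a second one.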

Adds DB-constraint, revoked-slot-reuse, and concurrent-upsert regression tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(channels): harden connect-status polling primitive

pollChannelConnectionUntilResolved was a free-floating recursive setTimeout
started from onSuccess with no cancellation, no per-provider dedup, a redundant
second endpoint per tick, and an unbounded loop on a non-finite expires_in.

- Extract a framework-agnostic, cancellable poller (connect-poll.ts) that polls
  only listChannelConnections() and invalidates the providers query once when the
  bind resolves, instead of fetching both endpoints every tick.
- Guard expires_in with a finite check + default window so undefined/NaN can no
  longer produce a poll loop that runs until the page closes.
- Track one active poll handle per provider in useConnectChannelProvider via a
  ref Map: a new connect cancels the prior poll for that provider, and a useEffect
  cleanup cancels all polls on unmount.

Adds unit tests for resolve-and-stop, cancellation, and non-finite-expiry.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(channels): stop leaking blocked-sender content in DingTalk INFO log; document bind semantics

Moving the allowed_users gate past _extract_text meant the parsed-message INFO
log (text=%r, first 100 chars) fired for senders that allowed_users would have
rejected, defeating the filter's noise/privacy role. Move that log to after the
allowed_users gate so blocked senders' message text never reaches INFO logs.

Also document the two operator-relevant semantic changes in backend/CLAUDE.md:
connect-code dispatch runs before allowed_users (so allowed_users is no longer a
bind-time defense; the model relies on code confidentiality + 600s TTL + one-time
consumption), and the single-active-owner-per-external-identity transfer semantics
now backed by the partial unique index.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(channels): note connect-code-vs-allowlist and ownership transfer in operator guide

Mirror the backend/CLAUDE.md notes in the operator-facing IM_CHANNEL_CONNECTIONS.md:
connect codes are consumed before allowed_users (so a not-yet-allowlisted user can
still complete a first bind, and allowed_users is not a bind-time defense), and an
external identity has at most one active owner with last-bind-wins transfer enforced
at the DB layer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(channels): lift connect-code dispatch into Channel base class

Each adapter duplicated the ordering-sensitive boilerplate of extracting a
/connect code and guarding on the connection repo before its allowed_users gate.
The duplication is what let telegram/wechat drift and keep the gate ahead of the
bind. Centralize it:

- Move `_connection_repo` onto Channel.__init__ (removing 7 duplicate assignments).
- Add Channel._pending_connect_code(text), which guards on the repo and extracts
  the code, documenting that adapters MUST consult it before authorization so a
  browser-initiated bind can bootstrap a not-yet-authorized identity.
- Route slack, discord, feishu, dingtalk, wechat, and wecom through the helper.
  This also fixes a latent inconsistency where slack dispatched a bind even when
  no connection repo was configured.
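
The ordering contract above can be sketched roughly as follows. `SketchChannel`, its `route` method, the `allowed_users` config key handling, and the regular expression are hypothetical stand-ins; only the observable behavior of `_pending_connect_code` (a code for `/connect <code>` when a repo is configured, `None` otherwise) is taken from the tests in this file.

```python
import re
from typing import Optional

# Assumed command shape; the real parser may accept other forms.
_CONNECT_RE = re.compile(r"^/connect\s+(\S+)\s*$")


class SketchChannel:
    """Hypothetical stand-in for the Channel base class, not the real one."""

    def __init__(self, config: dict) -> None:
        self._connection_repo = config.get("connection_repo")
        self._allowed_users = set(config.get("allowed_users", []))

    def _pending_connect_code(self, text: str) -> Optional[str]:
        # Without a connection repo, binding is disabled and codes are ignored.
        if self._connection_repo is None:
            return None
        match = _CONNECT_RE.match(text.strip())
        return match.group(1) if match else None

    def route(self, user_id: str, text: str) -> str:
        # The connect code is consulted BEFORE the allowlist, so an identity
        # that is not yet authorized can still complete its first bind.
        if self._pending_connect_code(text) is not None:
            return "bind"
        if self._allowed_users and user_id not in self._allowed_users:
            return "rejected"
        return "chat"
```

Keeping the guard and the extraction in one helper means an adapter cannot check the code without also checking that binding is configured.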

Pure refactor — the full channel suite stays green; adds a direct unit test for
the base helper's contract.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* make format

* fix(channels): redact DingTalk parsed-message INFO log content

Log text_len instead of the first 100 chars of message text, so message
content never reaches INFO logs (the after-gate move already keeps blocked
senders out entirely). This takes over the redaction from #3584 so only this
PR touches dingtalk.py, letting the two PRs merge in any order conflict-free.
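
A small sketch of the two logging rules combined, assuming a hypothetical `handle_parsed_message` function and logger name; the real adapter's log format and fields may differ. Blocked senders return before any log call, and allowed senders contribute only a length.

```python
import logging
from typing import Optional

logger = logging.getLogger("dingtalk_sketch")  # hypothetical logger name


def handle_parsed_message(sender_id: str, text: str, allowed_users: set[str]) -> Optional[str]:
    """Return the text for allowed senders; log its length, never its content."""
    # Gate first: nothing about a blocked sender's message reaches INFO.
    if allowed_users and sender_id not in allowed_users:
        return None
    # Redacted log line: text_len instead of a content prefix.
    logger.info("parsed message: sender=%s text_len=%d", sender_id, len(text))
    return text
```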

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 10:15:31 +08:00

280 lines
9.9 KiB
Python

"""Connection binding tests for browser-connectable IM channels beyond Telegram/Slack/Discord."""

from __future__ import annotations

from datetime import UTC, datetime, timedelta
from unittest.mock import AsyncMock, MagicMock

from app.channels.base import Channel
from app.channels.message_bus import InboundMessage, MessageBus, OutboundMessage


class _StubChannel(Channel):
    """Minimal concrete Channel used to exercise base-class helpers directly."""

    async def start(self) -> None:  # pragma: no cover - not exercised
        pass

    async def stop(self) -> None:  # pragma: no cover - not exercised
        pass

    async def send(self, msg: OutboundMessage) -> None:  # pragma: no cover - not exercised
        pass


def test_pending_connect_code_extracts_code_when_connections_configured():
    channel = _StubChannel(name="stub", bus=MessageBus(), config={"connection_repo": object()})
    # A connect command yields its code; ordinary text does not.
    assert channel._pending_connect_code("/connect abc123") == "abc123"
    assert channel._pending_connect_code("hello world") is None


def test_pending_connect_code_is_none_when_connections_disabled():
    # With no connection repo, binding is not configured and connect codes are
    # ignored so the message falls through to normal handling.
    channel = _StubChannel(name="stub", bus=MessageBus(), config={})
    assert channel._pending_connect_code("/connect abc123") is None


async def _make_repo(tmp_path, name: str):
    from deerflow.persistence.channel_connections import ChannelConnectionRepository
    from deerflow.persistence.engine import get_session_factory, init_engine

    await init_engine("sqlite", url=f"sqlite+aiosqlite:///{tmp_path / f'{name}.db'}", sqlite_dir=str(tmp_path))
    return ChannelConnectionRepository(get_session_factory())


async def _seed_state(repo, provider: str, state: str, owner_user_id: str = "deerflow-user-1") -> None:
    await repo.create_oauth_state(
        owner_user_id=owner_user_id,
        provider=provider,
        state=state,
        expires_at=datetime.now(UTC) + timedelta(minutes=5),
    )


def test_feishu_connect_command_binds_identity(tmp_path):
    import anyio

    from app.channels.feishu import FeishuChannel

    async def go():
        repo = await _make_repo(tmp_path, "feishu")
        state = "feishu-bind-code"
        await _seed_state(repo, "feishu", state)
        channel = FeishuChannel(
            bus=MessageBus(),
            config={"app_id": "app", "app_secret": "secret", "connection_repo": repo},
        )
        channel._reply_card = AsyncMock()

        handled = await channel._bind_connection_from_connect_code(
            message_id="om-message-1",
            chat_id="oc-chat-1",
            user_id="ou-user-1",
            code=state,
        )

        connections = await repo.list_connections("deerflow-user-1")
        assert handled is True
        assert len(connections) == 1
        assert connections[0]["provider"] == "feishu"
        assert connections[0]["external_account_id"] == "ou-user-1"
        assert connections[0]["workspace_id"] == "oc-chat-1"
        channel._reply_card.assert_awaited_once_with("om-message-1", "Feishu connected to DeerFlow.")
        await repo.close()

    anyio.run(go)


def test_dingtalk_connect_command_binds_identity(tmp_path):
    import anyio

    from app.channels.dingtalk import _CONVERSATION_TYPE_GROUP, DingTalkChannel

    async def go():
        repo = await _make_repo(tmp_path, "dingtalk")
        state = "dingtalk-bind-code"
        await _seed_state(repo, "dingtalk", state)
        channel = DingTalkChannel(
            bus=MessageBus(),
            config={"client_id": "client", "client_secret": "secret", "connection_repo": repo},
        )
        channel._send_connection_reply = AsyncMock()

        handled = await channel._bind_connection_from_connect_code(
            conversation_type=_CONVERSATION_TYPE_GROUP,
            sender_staff_id="staff-user-1",
            sender_nick="Alice",
            conversation_id="cid-group-1",
            code=state,
        )

        connections = await repo.list_connections("deerflow-user-1")
        assert handled is True
        assert len(connections) == 1
        assert connections[0]["provider"] == "dingtalk"
        assert connections[0]["external_account_id"] == "staff-user-1"
        assert connections[0]["external_account_name"] == "Alice"
        assert connections[0]["workspace_id"] == "cid-group-1"
        channel._send_connection_reply.assert_awaited_once()
        await repo.close()

    anyio.run(go)


def test_wechat_connect_command_binds_identity(tmp_path):
    import anyio

    from app.channels.wechat import WechatChannel

    async def go():
        repo = await _make_repo(tmp_path, "wechat")
        state = "wechat-bind-code"
        await _seed_state(repo, "wechat", state)
        channel = WechatChannel(
            bus=MessageBus(),
            config={"bot_token": "token", "connection_repo": repo},
        )
        channel._send_connection_reply = AsyncMock()

        handled = await channel._bind_connection_from_connect_code(
            chat_id="wx-user-1",
            context_token="ctx-1",
            code=state,
        )

        connections = await repo.list_connections("deerflow-user-1")
        assert handled is True
        assert len(connections) == 1
        assert connections[0]["provider"] == "wechat"
        assert connections[0]["external_account_id"] == "wx-user-1"
        assert connections[0]["workspace_id"] == "wx-user-1"
        channel._send_connection_reply.assert_awaited_once_with("wx-user-1", "ctx-1", "WeChat connected to DeerFlow.")
        await repo.close()

    anyio.run(go)


def test_wecom_connect_command_binds_identity(tmp_path):
    import anyio

    from app.channels.wecom import WeComChannel

    async def go():
        repo = await _make_repo(tmp_path, "wecom")
        state = "wecom-bind-code"
        await _seed_state(repo, "wecom", state)
        channel = WeComChannel(
            bus=MessageBus(),
            config={"bot_id": "bot", "bot_secret": "secret", "connection_repo": repo},
        )
        channel._ws_client = MagicMock()
        channel._ws_client.reply = AsyncMock()
        frame = {"body": {"aibotid": "bot-1", "chattype": "single"}}

        handled = await channel._bind_connection_from_connect_code(
            frame=frame,
            user_id="wecom-user-1",
            code=state,
        )

        connections = await repo.list_connections("deerflow-user-1")
        assert handled is True
        assert len(connections) == 1
        assert connections[0]["provider"] == "wecom"
        assert connections[0]["external_account_id"] == "wecom-user-1"
        assert connections[0]["workspace_id"] == "bot-1"
        channel._ws_client.reply.assert_awaited_once_with(frame, {"msgtype": "text", "text": {"content": "WeCom connected to DeerFlow."}})
        await repo.close()

    anyio.run(go)


def test_additional_channels_attach_owner_identity(tmp_path):
    import anyio

    from app.channels.dingtalk import _CONVERSATION_TYPE_GROUP, DingTalkChannel
    from app.channels.feishu import FeishuChannel
    from app.channels.wechat import WechatChannel
    from app.channels.wecom import WeComChannel

    async def go():
        repo = await _make_repo(tmp_path, "additional-identity")
        await repo.upsert_connection(
            owner_user_id="deerflow-user-1",
            provider="feishu",
            external_account_id="ou-user-1",
            workspace_id="oc-chat-1",
        )
        await repo.upsert_connection(
            owner_user_id="deerflow-user-1",
            provider="dingtalk",
            external_account_id="staff-user-1",
            workspace_id="cid-group-1",
        )
        await repo.upsert_connection(
            owner_user_id="deerflow-user-1",
            provider="wechat",
            external_account_id="wx-user-1",
            workspace_id="wx-user-1",
        )
        await repo.upsert_connection(
            owner_user_id="deerflow-user-1",
            provider="wecom",
            external_account_id="wecom-user-1",
            workspace_id="bot-1",
        )
        cases = [
            (
                FeishuChannel(bus=MessageBus(), config={"connection_repo": repo}),
                InboundMessage(channel_name="feishu", chat_id="oc-chat-1", user_id="ou-user-1", text="hello"),
            ),
            (
                DingTalkChannel(bus=MessageBus(), config={"connection_repo": repo}),
                InboundMessage(
                    channel_name="dingtalk",
                    chat_id="cid-group-1",
                    user_id="staff-user-1",
                    text="hello",
                    metadata={
                        "conversation_type": _CONVERSATION_TYPE_GROUP,
                        "conversation_id": "cid-group-1",
                    },
                ),
            ),
            (
                WechatChannel(bus=MessageBus(), config={"connection_repo": repo}),
                InboundMessage(channel_name="wechat", chat_id="wx-user-1", user_id="wx-user-1", text="hello"),
            ),
            (
                WeComChannel(bus=MessageBus(), config={"connection_repo": repo}),
                InboundMessage(
                    channel_name="wecom",
                    chat_id="wecom-user-1",
                    user_id="wecom-user-1",
                    text="hello",
                    metadata={"aibotid": "bot-1"},
                ),
            ),
        ]

        for channel, inbound in cases:
            attached = await channel._attach_connection_identity(inbound)
            assert attached.owner_user_id == "deerflow-user-1"
            assert attached.connection_id
            assert (
                attached.workspace_id
                == {
                    "feishu": "oc-chat-1",
                    "dingtalk": "cid-group-1",
                    "wechat": "wx-user-1",
                    "wecom": "bot-1",
                }[channel.name]
            )
        await repo.close()

    anyio.run(go)