feat(telegram): stream agent replies by editing the placeholder message in place (#3534)

* docs(spec): telegram streaming output design
* docs(plan): telegram streaming implementation plan
* feat(telegram): report streaming support for telegram channel
* test(channels): use slack as the non-streaming sample channel in manager tests
* feat(telegram): register running-reply placeholder as stream target
* test(telegram): pin last_edit_at sentinel in placeholder registration test
* refactor(telegram): extract _send_new_message from send()
* feat(telegram): edit streamed message in place for non-final updates
* feat(telegram): finalize streamed message with overflow splitting

  When is_final=True arrives and stream state exists, pop the state, edit the streamed placeholder with the final text, split overflow into follow-up send_message calls, update _last_bot_message, and clear stream state. Falls back to _send_new_message when no stream state is registered.

* test(telegram): exercise the not-modified handler in final edit path
* docs: telegram channel now streams replies via message editing
* fix(telegram): harden final-delivery path with guarded retry and chunk retries
* fix(channels): accept runtime 'messages' SSE event for streaming text accumulation

  The embedded runtime (matching LangGraph Platform semantics) emits SSE event name 'messages' for the requested 'messages-tuple' stream mode, so the manager never accumulated token deltas and streaming channels only updated from end-of-step 'values' snapshots; on Telegram this looked like 'Working on it...' followed by the full answer in one block.

* feat(telegram): widen stream-edit throttle to 3s in group chats

  Telegram caps bots at 20 messages/minute per group, stricter than the 1 msg/s per-chat guideline. Groups have negative chat ids, so pick the interval by sign.

* fix(telegram): address review findings: thread fallback messages, bound stream registry, share stream-event constants

  - Fallback/new stream messages now carry reply_to_message_id parsed from thread_ts so they stay nested under the user's message (finding 1)
  - STREAM_MODES / MESSAGE_STREAM_EVENTS constants link the requested stream modes to the SSE event names they arrive under (finding 2)
  - _register_stream_message bounds the in-flight registry at 256 entries, evicting oldest, guarding against leaks when a final never arrives (finding 4)

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
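The throttle rule described in the commit message (a wider interval for groups, chosen by the sign of the chat id) can be illustrated with a small standalone sketch. The constant and function names below are hypothetical and chosen for this example only; the channel itself inlines the check against its own module-level constants.

```python
# Illustrative constants mirroring the intervals named in the commit message.
PRIVATE_MIN_INTERVAL_SECONDS = 1.0
GROUP_MIN_INTERVAL_SECONDS = 3.0


def min_edit_interval(chat_id: int) -> float:
    """Pick the throttle interval: group chats have negative chat ids."""
    return GROUP_MIN_INTERVAL_SECONDS if chat_id < 0 else PRIVATE_MIN_INTERVAL_SECONDS


def should_edit(chat_id: int, last_edit_at: float, now: float) -> bool:
    """Return True once the chat's minimum interval has elapsed since the last edit."""
    return now - last_edit_at >= min_edit_interval(chat_id)
```

Passing the clock value in as `now` keeps the check deterministic, which is the same reason the diff routes `time.monotonic` through a patchable `_monotonic` indirection.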
2026-06-13 10:55:59 +00:00 · 2026-06-13 08:38:28 +08:00
parent 3475f7cdad
commit 839fa99237
6 changed files with 1557 additions and 23 deletions
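The final-delivery behavior summarized in the commit message (edit the placeholder with the first chunk, then send any overflow as follow-up messages) can be sketched as below. This is a minimal illustration, not the channel code: `plan_final_delivery` is a hypothetical helper, and the sketch leaves out the retry and fallback handling the commit adds.

```python
from __future__ import annotations

MAX_MESSAGE_LENGTH = 4096  # Telegram's per-message text limit used in the diff


def split_message(text: str) -> list[str]:
    """Cut text into consecutive chunks of at most MAX_MESSAGE_LENGTH characters."""
    chunks = [
        text[i : i + MAX_MESSAGE_LENGTH]
        for i in range(0, len(text), MAX_MESSAGE_LENGTH)
    ]
    # Empty text still yields one (empty) chunk so callers can index chunks[0].
    return chunks or [text]


def plan_final_delivery(text: str) -> tuple[str, list[str]]:
    """Return (text for the in-place edit, texts for follow-up messages)."""
    chunks = split_message(text)
    return chunks[0], chunks[1:]
```

Splitting on a fixed character count is the simplest policy and can cut a word in half; it matches the slicing used by `_split_message` in the diff.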
@@ -384,10 +384,10 @@ Bridges external messaging platforms (Feishu, Slack, Telegram, Discord, DingTalk
 **Components**:
 - `message_bus.py` - Async pub/sub hub (`InboundMessage` → queue → dispatcher; `OutboundMessage` → callbacks → channels)
 - `store.py` - JSON-file persistence mapping `channel_name:chat_id[:topic_id]` → `thread_id` (keys are `channel:chat` for root conversations and `channel:chat:topic` for threaded conversations)
- `manager.py` - Core dispatcher: creates threads via `client.threads.create()`, routes commands, keeps Slack/Telegram on `client.runs.wait()`, and uses `client.runs.stream(["messages-tuple", "values"])` for Feishu incremental outbound updates
+- `manager.py` - Core dispatcher: creates threads via `client.threads.create()`, routes commands, keeps Slack/Discord on `client.runs.wait()`, and uses `client.runs.stream(["messages-tuple", "values"])` for Feishu/Telegram incremental outbound updates
 - `base.py` - Abstract `Channel` base class (start/stop/send lifecycle)
 - `service.py` - Manages lifecycle of all configured channels from `config.yaml`
- `slack.py` / `feishu.py` / `telegram.py` / `discord.py` / `dingtalk.py` - Platform-specific implementations (`feishu.py` tracks the running card `message_id` in memory and patches the same card in place; `dingtalk.py` optionally uses AI Card streaming for in-place updates when `card_template_id` is configured)
+- `slack.py` / `feishu.py` / `telegram.py` / `discord.py` / `dingtalk.py` - Platform-specific implementations (`feishu.py` tracks the running card `message_id` in memory and patches the same card in place; `telegram.py` registers the "Working on it..." placeholder as the stream target and edits it in place via `editMessageText`; `dingtalk.py` optionally uses AI Card streaming for in-place updates when `card_template_id` is configured)
 - `app/gateway/routers/channel_connections.py` - Browser-facing user connection and disconnect APIs
 - `deerflow.persistence.channel_connections` - SQL-backed user-owned connection, optional credential, connect state, and conversation store

@@ -396,12 +396,13 @@ Bridges external messaging platforms (Feishu, Slack, Telegram, Discord, DingTalk
 2. `ChannelManager._dispatch_loop()` consumes from queue
 3. For user-owned channel connections, incoming messages carry `connection_id`, `owner_user_id`, and `workspace_id`; `owner_user_id` becomes the DeerFlow run `user_id`, while the raw platform user id remains `channel_user_id`
 4. For chat: look up/create thread through Gateway's LangGraph-compatible API
-5. Feishu chat: `runs.stream()` → accumulate AI text → publish multiple outbound updates (`is_final=False`) → publish final outbound (`is_final=True`)
-6. Slack/Telegram chat: `runs.wait()` → extract final response → publish outbound
+5. Feishu/Telegram chat: `runs.stream()` → accumulate AI text → publish multiple outbound updates (`is_final=False`) → publish final outbound (`is_final=True`)
+6. Slack/Discord chat: `runs.wait()` → extract final response → publish outbound
 7. Feishu channel sends one running reply card up front, then patches the same card for each outbound update (card JSON sets `config.update_multi=true` for Feishu's patch API requirement)
-8. DingTalk AI Card mode (when `card_template_id` configured): `runs.stream()` → create card with initial text → stream updates via `PUT /v1.0/card/streaming` → finalize on `is_final=True`. Falls back to `sampleMarkdown` if card creation or streaming fails
-9. For commands (`/new`, `/status`, `/models`, `/memory`, `/help`): handle locally or query Gateway API
-10. Outbound → channel callbacks → platform reply
+8. Telegram streaming: the "Working on it..." placeholder message is registered as the stream target; non-final updates edit it in place via `editMessageText` (channel-side throttle: 1s in private chats, 3s in groups due to Telegram's 20 msg/min group cap; text over 4096 characters is truncated; rate-limited updates are dropped); the final update performs the last edit and splits text longer than 4096 characters into follow-up messages
+9. DingTalk AI Card mode (when `card_template_id` configured): `runs.stream()` → create card with initial text → stream updates via `PUT /v1.0/card/streaming` → finalize on `is_final=True`. Falls back to `sampleMarkdown` if card creation or streaming fails
+10. For commands (`/new`, `/status`, `/models`, `/memory`, `/help`): handle locally or query Gateway API
+11. Outbound → channel callbacks → platform reply

 **Configuration** (`config.yaml` -> `channels`):
 - `langgraph_url` - LangGraph-compatible Gateway API base URL (default: `http://localhost:8001/api`)
@@ -49,6 +49,11 @@ DEFAULT_RUN_CONTEXT: dict[str, Any] = {
    "subagent_enabled": False,
 }
 STREAM_UPDATE_MIN_INTERVAL_SECONDS = 0.35
+# Stream modes requested from the runtime, and the SSE event names under which
+# the message-tuple stream may arrive: the embedded runtime (and LangGraph
+# Platform) deliver the requested "messages-tuple" mode as event "messages".
+STREAM_MODES = ["messages-tuple", "values"]
+MESSAGE_STREAM_EVENTS = ("messages-tuple", "messages")
 THREAD_BUSY_MESSAGE = "This conversation is already processing another request. Please wait for it to finish and try again."

 CHANNEL_CAPABILITIES = {
@@ -56,7 +61,7 @@ CHANNEL_CAPABILITIES = {
    "discord": {"supports_streaming": False},
    "feishu": {"supports_streaming": True},
    "slack": {"supports_streaming": False},
-    "telegram": {"supports_streaming": False},
+    "telegram": {"supports_streaming": True},
    "wechat": {"supports_streaming": False},
    "wecom": {"supports_streaming": True},
 }
@@ -1135,7 +1140,7 @@ class ChannelManager:
            "input": {"messages": [human_message]},
            "config": run_config,
            "context": run_context,
-            "stream_mode": ["messages-tuple", "values"],
+            "stream_mode": list(STREAM_MODES),
            "multitask_strategy": "reject",
        }
        if owner_headers := _owner_headers(msg):
@@ -1150,7 +1155,7 @@ class ChannelManager:
                event = getattr(chunk, "event", "")
                data = getattr(chunk, "data", None)

-                if event == "messages-tuple":
+                if event in MESSAGE_STREAM_EVENTS:
                    accumulated_text, current_message_id = _accumulate_stream_text(streamed_buffers, current_message_id, data)
                    if accumulated_text:
                        latest_text = accumulated_text
@@ -5,6 +5,7 @@ from __future__ import annotations
 import asyncio
 import logging
 import threading
+import time
 from typing import Any

 from app.channels.base import Channel
@@ -13,6 +14,18 @@ from app.channels.message_bus import InboundMessage, InboundMessageType, Message

 logger = logging.getLogger(__name__)

+TELEGRAM_MAX_MESSAGE_LENGTH = 4096
+STREAM_EDIT_MIN_INTERVAL_SECONDS = 1.0
+# Groups (negative chat_id) are capped at 20 messages/minute by Telegram,
+# so stream edits there must pace well below the private-chat 1 msg/s guideline.
+STREAM_EDIT_GROUP_MIN_INTERVAL_SECONDS = 3.0
+# Bound on tracked in-flight streamed messages; entries normally clear on the
+# final update, this only guards against leaks when a final never arrives.
+MAX_TRACKED_STREAM_MESSAGES = 256
+
+# Indirection so tests can patch the clock without touching the global time module.
+_monotonic = time.monotonic
+

 class TelegramChannel(Channel):
    """Telegram bot channel using long-polling.
@@ -36,8 +49,15 @@ class TelegramChannel(Channel):
                pass
        # chat_id -> last sent message_id for threaded replies
        self._last_bot_message: dict[str, int] = {}
+        # stream_key ("chat_id:thread_ts") -> state of the in-flight streamed
+        # bot message being edited in place: {"message_id", "last_edit_at", "last_text"}
+        self._stream_messages: dict[str, dict[str, Any]] = {}
        self._connection_repo = config.get("connection_repo")

+    @property
+    def supports_streaming(self) -> bool:
+        return True
+
    async def start(self) -> None:
        if self._running:
            return
@@ -104,10 +124,117 @@ class TelegramChannel(Channel):
            logger.error("Invalid Telegram chat_id: %s", msg.chat_id)
            return

-        kwargs: dict[str, Any] = {"chat_id": chat_id, "text": msg.text}
+        key = self._stream_key(msg.chat_id, msg.thread_ts)
+
+        if not msg.is_final:
+            await self._send_stream_update(chat_id, key, msg.text, reply_to=self._parse_message_id(msg.thread_ts))
+            return
+
+        state = self._stream_messages.pop(key, None)
+        if state is not None:
+            await self._finalize_stream_message(chat_id, msg.chat_id, state, msg.text)
+            return
+
+        await self._send_new_message(chat_id, msg.chat_id, msg.text, _max_retries=_max_retries)
+
+    async def _send_stream_update(self, chat_id: int, key: str, text: str, reply_to: int | None = None) -> None:
+        """Edit the in-flight streamed message with accumulated text.
+
+        Updates are best-effort: they are throttled, and rate-limited edits are
+        dropped silently.  The manager always publishes a final message
+        afterwards, which guarantees delivery of the complete text.
+        """
+        if not text:
+            return
+
+        display = text
+        if len(display) > TELEGRAM_MAX_MESSAGE_LENGTH:
+            display = display[: TELEGRAM_MAX_MESSAGE_LENGTH - 1] + "…"
+
+        bot = self._application.bot
+        state = self._stream_messages.get(key)
+
+        send_kwargs: dict[str, Any] = {"chat_id": chat_id, "text": display}
+        if reply_to:
+            send_kwargs["reply_to_message_id"] = reply_to
+
+        if state is None:
+            try:
+                sent = await bot.send_message(**send_kwargs)
+            except Exception:
+                logger.exception("[Telegram] failed to start stream message in chat=%s", chat_id)
+                return
+            self._register_stream_message(key, message_id=sent.message_id, last_text=display, last_edit_at=_monotonic())
+            return
+
+        now = _monotonic()
+        min_interval = STREAM_EDIT_GROUP_MIN_INTERVAL_SECONDS if chat_id < 0 else STREAM_EDIT_MIN_INTERVAL_SECONDS
+        if now - state["last_edit_at"] < min_interval:
+            return
+        if display == state["last_text"]:
+            return
+
+        try:
+            await bot.edit_message_text(chat_id=chat_id, message_id=state["message_id"], text=display)
+        except Exception as exc:
+            if self._is_not_modified(exc):
+                state["last_text"] = display
+                return
+            if self._is_retry_after(exc):
+                logger.debug("[Telegram] stream edit rate-limited in chat=%s, dropping update", chat_id)
+                return
+            logger.warning("[Telegram] stream edit failed in chat=%s, sending new message: %s", chat_id, exc)
+            try:
+                sent = await bot.send_message(**send_kwargs)
+            except Exception:
+                logger.exception("[Telegram] failed to send fallback stream message in chat=%s", chat_id)
+                return
+            state["message_id"] = sent.message_id
+
+        state["last_edit_at"] = _monotonic()
+        state["last_text"] = display
+
+    async def _finalize_stream_message(self, chat_id: int, chat_key: str, state: dict[str, Any], text: str) -> None:
+        """Apply the final text: edit the streamed message, splitting overflow into follow-ups."""
+        bot = self._application.bot
+        chunks = self._split_message(text or "")
+
+        edited = True
+        if chunks[0] != state["last_text"]:
+            edited = await self._edit_final_chunk(bot, chat_id, state["message_id"], chunks[0])
+
+        if edited:
+            self._last_bot_message[chat_key] = state["message_id"]
+        else:
+            # Edit could not be applied (e.g. message deleted) — deliver the
+            # first chunk as a fresh message with the standard retry policy.
+            await self._send_new_message(chat_id, chat_key, chunks[0])
+
+        for chunk in chunks[1:]:
+            await self._send_new_message(chat_id, chat_key, chunk)
+
+    async def _edit_final_chunk(self, bot, chat_id: int, message_id: int, text: str) -> bool:
+        """Edit with one rate-limit retry. Returns False if the edit could not be applied."""
+        for attempt in range(2):
+            try:
+                await bot.edit_message_text(chat_id=chat_id, message_id=message_id, text=text)
+                return True
+            except Exception as exc:
+                if self._is_not_modified(exc):
+                    return True
+                if self._is_retry_after(exc) and attempt == 0:
+                    await asyncio.sleep(self._retry_after_seconds(exc))
+                    continue
+                logger.warning("[Telegram] final edit failed in chat=%s: %s", chat_id, exc)
+                return False
+        return False
+
+    async def _send_new_message(self, chat_id: int, chat_key: str, text: str, *, _max_retries: int = 3) -> int | None:
+        """Send a fresh message with retry/backoff. Returns the sent message_id."""
+        kwargs: dict[str, Any] = {"chat_id": chat_id, "text": text}

        # Reply to the last bot message in this chat for threading
-        reply_to = self._last_bot_message.get(msg.chat_id)
+        reply_to = self._last_bot_message.get(chat_key)
        if reply_to:
            kwargs["reply_to_message_id"] = reply_to

@@ -116,8 +243,8 @@ class TelegramChannel(Channel):
        for attempt in range(_max_retries):
            try:
                sent = await bot.send_message(**kwargs)
-                self._last_bot_message[msg.chat_id] = sent.message_id
-                return
+                self._last_bot_message[chat_key] = sent.message_id
+                return sent.message_id
            except Exception as exc:
                last_exc = exc
                if attempt < _max_retries - 1:
@@ -180,17 +307,63 @@ class TelegramChannel(Channel):

    # -- helpers -----------------------------------------------------------

+    @staticmethod
+    def _stream_key(chat_id: str, thread_ts: str | None) -> str:
+        return f"{chat_id}:{thread_ts or ''}"
+
+    @staticmethod
+    def _parse_message_id(value: str | None) -> int | None:
+        try:
+            return int(value) if value else None
+        except (TypeError, ValueError):
+            return None
+
+    def _register_stream_message(self, key: str, *, message_id: int, last_text: str, last_edit_at: float) -> None:
+        self._stream_messages.pop(key, None)
+        while len(self._stream_messages) >= MAX_TRACKED_STREAM_MESSAGES:
+            self._stream_messages.pop(next(iter(self._stream_messages)))
+        self._stream_messages[key] = {
+            "message_id": message_id,
+            "last_edit_at": last_edit_at,
+            "last_text": last_text,
+        }
+
+    @staticmethod
+    def _is_retry_after(exc: Exception) -> bool:
+        return getattr(exc, "retry_after", None) is not None
+
+    @staticmethod
+    def _retry_after_seconds(exc: Exception) -> float:
+        value = getattr(exc, "retry_after", 0)
+        if hasattr(value, "total_seconds"):
+            return float(value.total_seconds())
+        return float(value)
+
+    @staticmethod
+    def _is_not_modified(exc: Exception) -> bool:
+        return "message is not modified" in str(exc).lower()
+
+    @staticmethod
+    def _split_message(text: str) -> list[str]:
+        return [text[i : i + TELEGRAM_MAX_MESSAGE_LENGTH] for i in range(0, len(text), TELEGRAM_MAX_MESSAGE_LENGTH)] or [text]
+
    async def _send_running_reply(self, chat_id: str, reply_to_message_id: int) -> None:
-        """Send a 'Working on it...' reply to the user's message."""
+        """Send a 'Working on it...' reply and register it as the stream target."""
        if not self._application:
            return
        try:
            bot = self._application.bot
-            await bot.send_message(
+            sent = await bot.send_message(
                chat_id=int(chat_id),
                text="Working on it...",
                reply_to_message_id=reply_to_message_id,
            )
+            self._register_stream_message(
+                self._stream_key(chat_id, str(reply_to_message_id)),
+                message_id=sent.message_id,
+                last_text="Working on it...",
+                last_edit_at=0.0,
+            )
            logger.info("[Telegram] 'Working on it...' reply sent in chat=%s", chat_id)
        except Exception:
            logger.exception("[Telegram] failed to send running reply in chat=%s", chat_id)
@@ -873,7 +873,7 @@ class TestChannelManager:
                bus=bus,
                store=store,
                channel_sessions={
-                    "telegram": {
+                    "slack": {
                        "assistant_id": "mobile_agent",
                        "config": {"recursion_limit": 55},
                        "context": {
@@ -896,7 +896,7 @@ class TestChannelManager:

            await manager.start()

-            inbound = InboundMessage(channel_name="telegram", chat_id="chat1", user_id="user1", text="hi")
+            inbound = InboundMessage(channel_name="slack", chat_id="chat1", user_id="user1", text="hi")
            await bus.publish_inbound(inbound)
            await _wait_for(lambda: len(outbound_received) >= 1)
            await manager.stop()
@@ -1047,7 +1047,7 @@ class TestChannelManager:
                store=store,
                default_session={"context": {"is_plan_mode": True}},
                channel_sessions={
-                    "telegram": {
+                    "slack": {
                        "assistant_id": "mobile_agent",
                        "config": {"recursion_limit": 55},
                        "context": {
@@ -1080,7 +1080,7 @@ class TestChannelManager:

            await manager.start()

-            inbound = InboundMessage(channel_name="telegram", chat_id="chat1", user_id="vip-user", text="hi")
+            inbound = InboundMessage(channel_name="slack", chat_id="chat1", user_id="vip-user", text="hi")
            await bus.publish_inbound(inbound)
            await _wait_for(lambda: len(outbound_received) >= 1)
            await manager.stop()
@@ -1202,6 +1202,76 @@ class TestChannelManager:

        _run(go())

+    def test_handle_streaming_chat_accepts_runtime_messages_event(self, monkeypatch):
+        """The embedded runtime emits SSE event name "messages" (LangGraph
+        Platform semantics) for the requested "messages-tuple" stream mode —
+        the manager must accumulate text from those events too."""
+        from app.channels.manager import ChannelManager
+
+        monkeypatch.setattr("app.channels.manager.STREAM_UPDATE_MIN_INTERVAL_SECONDS", 0.0)
+
+        async def go():
+            bus = MessageBus()
+            store = ChannelStore(path=Path(tempfile.mkdtemp()) / "store.json")
+            manager = ChannelManager(bus=bus, store=store)
+
+            outbound_received = []
+
+            async def capture_outbound(msg):
+                outbound_received.append(msg)
+
+            bus.subscribe_outbound(capture_outbound)
+
+            stream_events = [
+                _make_stream_part(
+                    "messages",
+                    [
+                        {"id": "ai-1", "content": "Hello", "type": "AIMessageChunk"},
+                        {"langgraph_node": "agent"},
+                    ],
+                ),
+                _make_stream_part(
+                    "messages",
+                    [
+                        {"id": "ai-1", "content": " world", "type": "AIMessageChunk"},
+                        {"langgraph_node": "agent"},
+                    ],
+                ),
+                _make_stream_part(
+                    "values",
+                    {
+                        "messages": [
+                            {"type": "human", "content": "hi"},
+                            {"type": "ai", "content": "Hello world"},
+                        ],
+                        "artifacts": [],
+                    },
+                ),
+            ]
+
+            mock_client = _make_mock_langgraph_client()
+            mock_client.runs.stream = MagicMock(return_value=_make_async_iterator(stream_events))
+            manager._client = mock_client
+
+            await manager.start()
+
+            inbound = InboundMessage(
+                channel_name="telegram",
+                chat_id="chat1",
+                user_id="user1",
+                text="hi",
+                thread_ts="42",
+            )
+            await bus.publish_inbound(inbound)
+            await _wait_for(lambda: len(outbound_received) >= 3)
+            await manager.stop()
+
+            mock_client.runs.stream.assert_called_once()
+            assert [msg.text for msg in outbound_received] == ["Hello", "Hello world", "Hello world"]
+            assert [msg.is_final for msg in outbound_received] == [False, False, True]
+
+        _run(go())
+
    def test_handle_feishu_streaming_marks_only_final_clarification_outbound(self, monkeypatch):
        from app.channels.manager import ChannelManager

@@ -2044,7 +2114,7 @@ class TestChannelManager:
        _run(go())

    def test_none_topic_reuses_thread(self):
-        """Messages with topic_id=None should reuse the same thread (e.g. Telegram private chat)."""
+        """Messages with topic_id=None should reuse the same thread (e.g. a private/direct chat)."""
        from app.channels.manager import ChannelManager

        async def go():
@@ -2063,10 +2133,10 @@ class TestChannelManager:
            bus.subscribe_outbound(capture)
            await manager.start()

-            # Send two messages with topic_id=None (simulates Telegram private chat)
+            # Send two messages with topic_id=None (simulates a private/direct chat)
            for text in ["hello", "what did I just say?"]:
                msg = InboundMessage(
-                    channel_name="telegram",
+                    channel_name="slack",
                    chat_id="chat1",
                    user_id="user1",
                    text=text,
@@ -4766,3 +4836,439 @@ class TestSlackMarkdownConversion:
        result = _slack_md_converter.convert("# Title")
        assert "*Title*" in result
        assert "#" not in result
+
+
+# ---------------------------------------------------------------------------
+# Telegram streaming tests
+# ---------------------------------------------------------------------------
+
+
+class TestTelegramStreaming:
+    @staticmethod
+    def _make_channel_with_bot():
+        from app.channels.telegram import TelegramChannel
+
+        bus = MessageBus()
+        ch = TelegramChannel(bus=bus, config={"bot_token": "test-token"})
+
+        mock_app = MagicMock()
+        bot = SimpleNamespace()
+        bot.sent = []
+        bot.edited = []
+        bot.next_message_id = 100
+
+        async def send_message(**kwargs):
+            bot.sent.append(kwargs)
+            result = MagicMock()
+            result.message_id = bot.next_message_id
+            bot.next_message_id += 1
+            return result
+
+        async def edit_message_text(**kwargs):
+            bot.edited.append(kwargs)
+            result = MagicMock()
+            result.message_id = kwargs["message_id"]
+            return result
+
+        bot.send_message = send_message
+        bot.edit_message_text = edit_message_text
+        mock_app.bot = bot
+        ch._application = mock_app
+        return ch, bot
+
+    def test_stream_updates_edit_placeholder_in_place(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+            placeholder_id = ch._stream_messages["12345:42"]["message_id"]
+
+            update1 = OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="Hello", is_final=False, thread_ts="42")
+            await ch.send(update1)
+
+            clock["now"] += 2.0
+            update2 = OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="Hello world", is_final=False, thread_ts="42")
+            await ch.send(update2)
+
+            assert len(bot.sent) == 1  # only the placeholder
+            assert [e["message_id"] for e in bot.edited] == [placeholder_id, placeholder_id]
+            assert [e["text"] for e in bot.edited] == ["Hello", "Hello world"]
+
+        _run(go())
+
+    def test_stream_updates_throttled_within_interval(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="a", is_final=False, thread_ts="42"))
+            clock["now"] += 0.3  # within 1s window -> dropped
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="ab", is_final=False, thread_ts="42"))
+            clock["now"] += 1.0  # past window -> edited
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="abc", is_final=False, thread_ts="42"))
+
+            assert [e["text"] for e in bot.edited] == ["a", "abc"]
+
+        _run(go())
+
+    def test_stream_updates_in_group_chat_use_wider_throttle(self, monkeypatch):
+        """Telegram groups (negative chat_id) are capped at 20 messages/minute,
+        so group-chat stream edits throttle at 3s instead of 1s."""
+
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("-100123", 42)
+
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="-100123", thread_id="t1", text="a", is_final=False, thread_ts="42"))
+            clock["now"] += 1.2  # past the 1s private window, within the 3s group window -> dropped
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="-100123", thread_id="t1", text="ab", is_final=False, thread_ts="42"))
+            clock["now"] += 2.0  # 3.2s since last edit -> edited
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="-100123", thread_id="t1", text="abc", is_final=False, thread_ts="42"))
+
+            assert [e["text"] for e in bot.edited] == ["a", "abc"]
+
+        _run(go())
+
+    def test_stream_update_without_placeholder_sends_new_message(self):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="Hi", is_final=False, thread_ts="42"))
+
+            assert len(bot.sent) == 1
+            assert bot.sent[0]["text"] == "Hi"
+            # Threads under the user's message that started this turn
+            assert bot.sent[0]["reply_to_message_id"] == 42
+            assert ch._stream_messages["12345:42"]["message_id"] == 100
+
+        _run(go())
+
+    def test_stream_edit_fallback_message_threads_under_user_message(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+
+            async def edit_gone(**kwargs):
+                raise Exception("Bad Request: message to edit not found")
+
+            bot.edit_message_text = edit_gone
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="Hi", is_final=False, thread_ts="42"))
+
+            # Fallback message threads under the user's message and becomes the new stream target
+            assert bot.sent[1]["text"] == "Hi"
+            assert bot.sent[1]["reply_to_message_id"] == 42
+            assert ch._stream_messages["12345:42"]["message_id"] == 101
+
+        _run(go())
+
+    def test_stream_message_registry_is_bounded(self):
+        from app.channels.telegram import MAX_TRACKED_STREAM_MESSAGES
+
+        async def go():
+            ch, _bot = self._make_channel_with_bot()
+
+            for i in range(MAX_TRACKED_STREAM_MESSAGES + 1):
+                ch._register_stream_message(f"chat:{i}", message_id=i, last_text="x", last_edit_at=0.0)
+
+            assert len(ch._stream_messages) == MAX_TRACKED_STREAM_MESSAGES
+            assert "chat:0" not in ch._stream_messages  # oldest evicted
+            assert f"chat:{MAX_TRACKED_STREAM_MESSAGES}" in ch._stream_messages
+
+        _run(go())
+
+    def test_stream_update_truncates_long_text(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+            long_text = "x" * 5000
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text=long_text, is_final=False, thread_ts="42"))
+
+            assert len(bot.edited) == 1
+            assert len(bot.edited[0]["text"]) == 4096
+            assert bot.edited[0]["text"].endswith("…")
+
+        _run(go())
+
+    def test_stream_update_retry_after_is_dropped(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+
+            async def edit_rate_limited(**kwargs):
+                exc = Exception("Flood control exceeded")
+                exc.retry_after = 5
+                raise exc
+
+            bot.edit_message_text = edit_rate_limited
+            # Must not raise, must not send a new message
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="Hi", is_final=False, thread_ts="42"))
+            assert len(bot.sent) == 1  # placeholder only
+
+        _run(go())
+
+    def test_telegram_reports_streaming_support(self):
+        from app.channels.manager import CHANNEL_CAPABILITIES
+        from app.channels.telegram import TelegramChannel
+
+        bus = MessageBus()
+        ch = TelegramChannel(bus=bus, config={"bot_token": "test-token"})
+        assert ch.supports_streaming is True
+        assert CHANNEL_CAPABILITIES["telegram"]["supports_streaming"] is True
+
+    def test_running_reply_registers_stream_placeholder(self):
+        from app.channels.telegram import TelegramChannel
+
+        async def go():
+            bus = MessageBus()
+            ch = TelegramChannel(bus=bus, config={"bot_token": "test-token"})
+
+            mock_app = MagicMock()
+            mock_bot = AsyncMock()
+            sent = MagicMock()
+            sent.message_id = 777
+            mock_bot.send_message = AsyncMock(return_value=sent)
+            mock_app.bot = mock_bot
+            ch._application = mock_app
+
+            await ch._send_running_reply("12345", 42)
+
+            state = ch._stream_messages["12345:42"]
+            assert state["message_id"] == 777
+            assert state["last_edit_at"] == 0.0
+            assert state["last_text"] == "Working on it..."
+            mock_bot.send_message.assert_awaited_once_with(
+                chat_id=12345,
+                text="Working on it...",
+                reply_to_message_id=42,
+            )
+
+        _run(go())
+
+    def test_final_message_edits_stream_message_and_clears_state(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+            placeholder_id = ch._stream_messages["12345:42"]["message_id"]
+
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="partial", is_final=False, thread_ts="42"))
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="full answer", is_final=True, thread_ts="42"))
+
+            assert [e["text"] for e in bot.edited] == ["partial", "full answer"]
+            assert len(bot.sent) == 1  # placeholder only — final edited, not re-sent
+            assert "12345:42" not in ch._stream_messages
+            assert ch._last_bot_message["12345"] == placeholder_id
+
+        _run(go())
+
+    def test_final_message_splits_long_text(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+            long_text = "a" * 4096 + "b" * 100
+
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text=long_text, is_final=True, thread_ts="42"))
+
+            assert len(bot.edited) == 1
+            assert bot.edited[0]["text"] == "a" * 4096
+            follow_ups = bot.sent[1:]  # bot.sent[0] is the placeholder
+            assert [m["text"] for m in follow_ups] == ["b" * 100]
+            # Fake bot assigns ids sequentially: placeholder=100, follow-up chunk=101
+            assert ch._last_bot_message["12345"] == 101
+            assert "12345:42" not in ch._stream_messages
+
+        _run(go())
+
+    def test_final_message_not_modified_error_is_ignored(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="done", is_final=False, thread_ts="42"))
+
+            async def edit_not_modified(**kwargs):
+                raise Exception("Bad Request: message is not modified")
+
+            bot.edit_message_text = edit_not_modified
+            # Same text again as final — skipped via the equal-text guard:
+            # must not raise, must not send a new message
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="done", is_final=True, thread_ts="42"))
+
+            assert len(bot.sent) == 1  # placeholder only
+            assert "12345:42" not in ch._stream_messages
+
+        _run(go())
+
+    def test_final_edit_raising_not_modified_is_swallowed(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+            placeholder_id = ch._stream_messages["12345:42"]["message_id"]
+
+            async def edit_not_modified(**kwargs):
+                raise Exception("Bad Request: message is not modified")
+
+            bot.edit_message_text = edit_not_modified
+            # Final text differs from last_text, so the edit IS attempted and
+            # raises not-modified — must be swallowed, no fallback send.
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="done", is_final=True, thread_ts="42"))
+
+            assert len(bot.sent) == 1  # placeholder only
+            assert "12345:42" not in ch._stream_messages
+            assert ch._last_bot_message["12345"] == placeholder_id
+
+        _run(go())
+
+    def test_final_without_stream_state_sends_plain_message(self):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="direct", is_final=True, thread_ts=None))
+
+            assert len(bot.sent) == 1
+            assert bot.sent[0]["text"] == "direct"
+            assert len(bot.edited) == 0
+
+        _run(go())
+
+    def test_final_edit_retries_once_after_rate_limit(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            sleeps = []
+
+            async def fake_sleep(delay):
+                sleeps.append(delay)
+
+            monkeypatch.setattr("app.channels.telegram.asyncio.sleep", fake_sleep)
+
+            await ch._send_running_reply("12345", 42)
+            placeholder_id = ch._stream_messages["12345:42"]["message_id"]
+
+            real_edit = bot.edit_message_text
+            calls = {"n": 0}
+
+            async def edit_flaky(**kwargs):
+                calls["n"] += 1
+                if calls["n"] == 1:
+                    exc = Exception("Flood control exceeded")
+                    exc.retry_after = 3
+                    raise exc
+                return await real_edit(**kwargs)
+
+            bot.edit_message_text = edit_flaky
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="final", is_final=True, thread_ts="42"))
+
+            assert sleeps == [3.0]
+            assert [e["text"] for e in bot.edited] == ["final"]
+            assert len(bot.sent) == 1  # placeholder only
+            assert ch._last_bot_message["12345"] == placeholder_id
+            assert "12345:42" not in ch._stream_messages
+
+        _run(go())
+
+    def test_final_edit_double_rate_limit_falls_back_to_new_message(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            sleeps = []
+
+            async def fake_sleep(delay):
+                sleeps.append(delay)
+
+            monkeypatch.setattr("app.channels.telegram.asyncio.sleep", fake_sleep)
+
+            await ch._send_running_reply("12345", 42)
+
+            async def edit_rate_limited(**kwargs):
+                exc = Exception("Flood control exceeded")
+                exc.retry_after = 2
+                raise exc
+
+            bot.edit_message_text = edit_rate_limited
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="final", is_final=True, thread_ts="42"))
+
+            # Fallback delivered the final text as a new message (after the placeholder)
+            assert [m["text"] for m in bot.sent] == ["Working on it...", "final"]
+            assert ch._last_bot_message["12345"] == 101
+            assert "12345:42" not in ch._stream_messages
+
+        _run(go())
+
+    def test_final_overflow_chunk_send_is_retried(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            sleeps = []
+
+            async def fake_sleep(delay):
+                sleeps.append(delay)
+
+            monkeypatch.setattr("app.channels.telegram.asyncio.sleep", fake_sleep)
+
+            await ch._send_running_reply("12345", 42)
+
+            real_send = bot.send_message
+            failures = {"left": 1}
+
+            async def send_flaky(**kwargs):
+                if failures["left"] > 0:
+                    failures["left"] -= 1
+                    raise ConnectionError("transient")
+                return await real_send(**kwargs)
+
+            bot.send_message = send_flaky
+            long_text = "a" * 4096 + "b" * 10
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text=long_text, is_final=True, thread_ts="42"))
+
+            assert bot.edited[0]["text"] == "a" * 4096
+            assert [m["text"] for m in bot.sent] == ["Working on it...", "b" * 10]
+            assert ch._last_bot_message["12345"] == 101
+
+        _run(go())
@@ -0,0 +1,770 @@
+# Telegram Streaming Output Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Make the Telegram channel stream agent replies by editing one message in place (like Feishu's card patching), instead of waiting for the full result.
+
+**Architecture:** Flip `supports_streaming` for Telegram so `ChannelManager._handle_streaming_chat()` publishes incremental `is_final=False` outbound updates. It already does this for Feishu, so no manager logic changes are needed. All adaptation lives in `TelegramChannel`: the "Working on it..." placeholder message is registered as the stream target; non-final updates call `edit_message_text` on it (channel-side 1s throttle, truncation to 4096 characters, updates dropped on 429 rate-limit responses); and the guaranteed `is_final=True` message performs the last edit, splitting texts longer than 4096 characters into follow-up messages.
+
+**Tech Stack:** Python 3.12, python-telegram-bot (mocked in tests), pytest.
+
+**Spec:** `docs/superpowers/specs/2026-06-12-telegram-streaming-design.md`
+
+**Branch:** `feat/telegram-streaming` (already created, spec committed)
+
+**Key existing facts** (verified against the codebase):
+- `OutboundMessage.is_final` defaults to `True` (`backend/app/channels/message_bus.py:119`), so error/command direct sends stay final.
+- `ChannelManager._channel_supports_streaming()` (`backend/app/channels/manager.py:746`) prefers the **live channel instance's `supports_streaming` property** and falls back to `CHANNEL_CAPABILITIES`. Both must be updated.
+- The streaming pipeline always publishes a final `is_final=True` message even on stream errors (`manager.py:1185-1224` `finally` block).
+- `_send_running_reply()` is awaited **before** the inbound message is published (`telegram.py:324-326`), so the placeholder always exists before any outbound arrives.
+- Outbound `thread_ts` equals the inbound `thread_ts`, which Telegram sets to the user message id (`telegram.py:397`). So the stream key `f"{chat_id}:{thread_ts}"` matches the placeholder registered with the user message id.
+- Existing tests to keep green: `tests/test_channels.py::TestTelegramSendRetry` (send retry semantics, `_max_retries=0` RuntimeError).
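The second fact above is why Task 1 touches two places. Below is a minimal sketch of that lookup order, assuming the manager simply prefers a live instance's property over the static table. `channel_supports_streaming`, `StaleTelegramChannel`, and the `slack` entry are illustrative stand-ins, not the real `ChannelManager` code.

```python
# Illustrative stand-in for the static capability table in manager.py.
CHANNEL_CAPABILITIES = {
    "telegram": {"supports_streaming": True},
    "slack": {"supports_streaming": False},
}


def channel_supports_streaming(channel_name, live_channel=None):
    """Hypothetical helper: prefer the live instance, else the static table."""
    if live_channel is not None:
        return bool(getattr(live_channel, "supports_streaming", False))
    entry = CHANNEL_CAPABILITIES.get(channel_name, {})
    return bool(entry.get("supports_streaming", False))


class StaleTelegramChannel:
    """A live channel whose property was not flipped (the mistake to avoid)."""

    supports_streaming = False


# The table says True, but a live instance takes precedence when present.
table_only = channel_supports_streaming("telegram")
with_stale_instance = channel_supports_streaming("telegram", StaleTelegramChannel())
print(table_only, with_stale_instance)
```

If only the table were updated, a running channel instance would still report no streaming support under this lookup order, which is the failure mode the plan guards against.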
+
+**Intentional behavior change:** command replies (e.g. `/help`) and error replies now *edit* the "Working on it..." placeholder instead of sending a second message, because their stream key matches the placeholder and they arrive with `is_final=True`. This improves the UX and is covered by a test.
+
+Run tests from `backend/`: `PYTHONPATH=. uv run pytest tests/test_channels.py -v`
+
+---
+
+### Task 1: Capability flip — Telegram reports streaming support
+
+**Files:**
+- Modify: `backend/app/channels/manager.py:59` (CHANNEL_CAPABILITIES)
+- Modify: `backend/app/channels/telegram.py` (add `supports_streaming` property)
+- Test: `backend/tests/test_channels.py` (new class `TestTelegramStreaming`)
+
+- [ ] **Step 1: Write the failing test**
+
+Append to `backend/tests/test_channels.py` (bottom of file). The file already imports `MessageBus`, `OutboundMessage`, `ChannelManager`, `pytest`, `SimpleNamespace`, `MagicMock`, `AsyncMock`, and defines `_run()`:
+
+```python
+# ---------------------------------------------------------------------------
+# Telegram streaming tests
+# ---------------------------------------------------------------------------
+
+
+class TestTelegramStreaming:
+    def test_telegram_reports_streaming_support(self):
+        from app.channels.manager import CHANNEL_CAPABILITIES
+        from app.channels.telegram import TelegramChannel
+
+        bus = MessageBus()
+        ch = TelegramChannel(bus=bus, config={"bot_token": "test-token"})
+        assert ch.supports_streaming is True
+        assert CHANNEL_CAPABILITIES["telegram"]["supports_streaming"] is True
+```
+
+- [ ] **Step 2: Run test to verify it fails**
+
+Run: `PYTHONPATH=. uv run pytest tests/test_channels.py::TestTelegramStreaming::test_telegram_reports_streaming_support -v`
+Expected: FAIL with `assert False is True` (base class property returns False).
+
+- [ ] **Step 3: Implement**
+
+In `backend/app/channels/manager.py:59` change:
+
+```python
+    "telegram": {"supports_streaming": False},
+```
+
+to:
+
+```python
+    "telegram": {"supports_streaming": True},
+```
+
+In `backend/app/channels/telegram.py`, add a property right after `__init__` (before `async def start`):
+
+```python
+    @property
+    def supports_streaming(self) -> bool:
+        return True
+```
+
+- [ ] **Step 4: Run test to verify it passes**
+
+Run: `PYTHONPATH=. uv run pytest tests/test_channels.py::TestTelegramStreaming -v`
+Expected: PASS
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add backend/app/channels/manager.py backend/app/channels/telegram.py backend/tests/test_channels.py
+git commit -m "feat(telegram): report streaming support for telegram channel"
+```
+
+---
+
+### Task 2: Stream state infrastructure + placeholder registration
+
+**Files:**
+- Modify: `backend/app/channels/telegram.py` (constants, `__init__`, helpers, `_send_running_reply`)
+- Test: `backend/tests/test_channels.py` (`TestTelegramStreaming`)
+
+- [ ] **Step 1: Write the failing test**
+
+Add to `TestTelegramStreaming`:
+
+```python
+    def test_running_reply_registers_stream_placeholder(self):
+        from app.channels.telegram import TelegramChannel
+
+        async def go():
+            bus = MessageBus()
+            ch = TelegramChannel(bus=bus, config={"bot_token": "test-token"})
+
+            mock_app = MagicMock()
+            mock_bot = AsyncMock()
+            sent = MagicMock()
+            sent.message_id = 777
+            mock_bot.send_message = AsyncMock(return_value=sent)
+            mock_app.bot = mock_bot
+            ch._application = mock_app
+
+            await ch._send_running_reply("12345", 42)
+
+            state = ch._stream_messages["12345:42"]
+            assert state["message_id"] == 777
+            assert state["last_text"] == "Working on it..."
+            mock_bot.send_message.assert_awaited_once_with(
+                chat_id=12345,
+                text="Working on it...",
+                reply_to_message_id=42,
+            )
+
+        _run(go())
+```
+
+- [ ] **Step 2: Run test to verify it fails**
+
+Run: `PYTHONPATH=. uv run pytest tests/test_channels.py::TestTelegramStreaming::test_running_reply_registers_stream_placeholder -v`
+Expected: FAIL with `AttributeError: 'TelegramChannel' object has no attribute '_stream_messages'`
+
+- [ ] **Step 3: Implement**
+
+In `backend/app/channels/telegram.py`:
+
+a) Add `import time` to the imports block at the top (after `import threading`), and add these module constants after `logger = logging.getLogger(__name__)`:
+
+```python
+TELEGRAM_MAX_MESSAGE_LENGTH = 4096
+STREAM_EDIT_MIN_INTERVAL_SECONDS = 1.0
+
+# Indirection so tests can patch the clock without touching the global time module.
+_monotonic = time.monotonic
+```
+
+b) In `__init__`, after `self._last_bot_message: dict[str, int] = {}`:
+
+```python
+        # stream_key ("chat_id:thread_ts") -> state of the in-flight streamed
+        # bot message being edited in place: {"message_id", "last_edit_at", "last_text"}
+        self._stream_messages: dict[str, dict[str, Any]] = {}
+```
+
+c) Add helpers in the `# -- helpers --` section (before `_send_running_reply`):
+
+```python
+    @staticmethod
+    def _stream_key(chat_id: str, thread_ts: str | None) -> str:
+        return f"{chat_id}:{thread_ts or ''}"
+
+    @staticmethod
+    def _is_retry_after(exc: Exception) -> bool:
+        return getattr(exc, "retry_after", None) is not None
+
+    @staticmethod
+    def _retry_after_seconds(exc: Exception) -> float:
+        value = getattr(exc, "retry_after", 0)
+        if hasattr(value, "total_seconds"):
+            return float(value.total_seconds())
+        return float(value)
+
+    @staticmethod
+    def _is_not_modified(exc: Exception) -> bool:
+        return "message is not modified" in str(exc).lower()
+
+    @staticmethod
+    def _split_message(text: str) -> list[str]:
+        return [text[i : i + TELEGRAM_MAX_MESSAGE_LENGTH] for i in range(0, len(text), TELEGRAM_MAX_MESSAGE_LENGTH)] or [text]
+```
+
+d) Replace `_send_running_reply` (`telegram.py:183-196`) with:
+
+```python
+    async def _send_running_reply(self, chat_id: str, reply_to_message_id: int) -> None:
+        """Send a 'Working on it...' reply and register it as the stream target."""
+        if not self._application:
+            return
+        try:
+            bot = self._application.bot
+            sent = await bot.send_message(
+                chat_id=int(chat_id),
+                text="Working on it...",
+                reply_to_message_id=reply_to_message_id,
+            )
+            self._stream_messages[self._stream_key(chat_id, str(reply_to_message_id))] = {
+                "message_id": sent.message_id,
+                "last_edit_at": 0.0,
+                "last_text": "Working on it...",
+            }
+            logger.info("[Telegram] 'Working on it...' reply sent in chat=%s", chat_id)
+        except Exception:
+            logger.exception("[Telegram] failed to send running reply in chat=%s", chat_id)
+```
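To make the behavior of the step c) helpers concrete, here is a small standalone sketch that copies three of them as plain functions and exercises the edge cases. The free-function names (`stream_key`, `retry_after_seconds`, `split_message`) are stand-ins for the static methods above, and the `timedelta` input is only an assumption about one shape `retry_after` might take.

```python
from datetime import timedelta
from typing import List, Optional

TELEGRAM_MAX_MESSAGE_LENGTH = 4096


def stream_key(chat_id: str, thread_ts: Optional[str]) -> str:
    # A missing thread_ts collapses to an empty suffix, e.g. "12345:".
    return f"{chat_id}:{thread_ts or ''}"


def retry_after_seconds(exc: Exception) -> float:
    # Accept either a plain number or a timedelta-like object.
    value = getattr(exc, "retry_after", 0)
    if hasattr(value, "total_seconds"):
        return float(value.total_seconds())
    return float(value)


def split_message(text: str) -> List[str]:
    # Fixed-width chunks; the trailing "or [text]" keeps empty input as [""].
    step = TELEGRAM_MAX_MESSAGE_LENGTH
    return [text[i : i + step] for i in range(0, len(text), step)] or [text]


numeric = Exception("Flood control exceeded")
numeric.retry_after = 5
delta = Exception("Flood control exceeded")
delta.retry_after = timedelta(seconds=3)

chunk_lengths = [len(chunk) for chunk in split_message("a" * 4096 + "b" * 100)]
print(stream_key("12345", "42"), stream_key("12345", None))
print(retry_after_seconds(numeric), retry_after_seconds(delta))
print(chunk_lengths)
```

Note that the split is purely by character count, so a chunk boundary can fall in the middle of a word.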
+
+- [ ] **Step 4: Run tests to verify pass (including existing retry tests)**
+
+Run: `PYTHONPATH=. uv run pytest tests/test_channels.py::TestTelegramStreaming tests/test_channels.py::TestTelegramSendRetry -v`
+Expected: all PASS
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add backend/app/channels/telegram.py backend/tests/test_channels.py
+git commit -m "feat(telegram): register running-reply placeholder as stream target"
+```
+
+---
+
+### Task 3: Refactor `send()` — extract `_send_new_message` (no behavior change)
+
+**Files:**
+- Modify: `backend/app/channels/telegram.py:97-137` (`send`)
+- Test: existing `tests/test_channels.py::TestTelegramSendRetry` must stay green
+
+- [ ] **Step 1: Replace `send()` with the dispatching version + extracted helper**
+
+Replace the whole `send()` method (`telegram.py:97-137`) with:
+
+```python
+    async def send(self, msg: OutboundMessage, *, _max_retries: int = 3) -> None:
+        if not self._application:
+            return
+
+        try:
+            chat_id = int(msg.chat_id)
+        except (ValueError, TypeError):
+            logger.error("Invalid Telegram chat_id: %s", msg.chat_id)
+            return
+
+        await self._send_new_message(chat_id, msg.chat_id, msg.text, _max_retries=_max_retries)
+
+    async def _send_new_message(self, chat_id: int, chat_key: str, text: str, *, _max_retries: int = 3) -> int | None:
+        """Send a fresh message with retry/backoff. Returns the sent message_id."""
+        kwargs: dict[str, Any] = {"chat_id": chat_id, "text": text}
+
+        # Reply to the last bot message in this chat for threading
+        reply_to = self._last_bot_message.get(chat_key)
+        if reply_to:
+            kwargs["reply_to_message_id"] = reply_to
+
+        bot = self._application.bot
+        last_exc: Exception | None = None
+        for attempt in range(_max_retries):
+            try:
+                sent = await bot.send_message(**kwargs)
+                self._last_bot_message[chat_key] = sent.message_id
+                return sent.message_id
+            except Exception as exc:
+                last_exc = exc
+                if attempt < _max_retries - 1:
+                    delay = 2**attempt  # 1s, 2s
+                    logger.warning(
+                        "[Telegram] send failed (attempt %d/%d), retrying in %ds: %s",
+                        attempt + 1,
+                        _max_retries,
+                        delay,
+                        exc,
+                    )
+                    await asyncio.sleep(delay)
+
+        logger.error("[Telegram] send failed after %d attempts: %s", _max_retries, last_exc)
+        if last_exc is None:
+            raise RuntimeError("Telegram send failed without an exception from any attempt")
+        raise last_exc
+```
+
+- [ ] **Step 2: Run existing retry tests to verify no regression**
+
+Run: `PYTHONPATH=. uv run pytest tests/test_channels.py::TestTelegramSendRetry tests/test_channels.py::TestTelegramStreaming -v`
+Expected: all PASS (pure refactor)
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add backend/app/channels/telegram.py
+git commit -m "refactor(telegram): extract _send_new_message from send()"
+```
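The retry loop in `_send_new_message` can be reasoned about in isolation. The following sketch reproduces only its control flow with an injectable sleep so the backoff schedule is observable. `send_with_retry`, `demo`, and the flaky sender are hypothetical names for illustration, not part of the channel.

```python
import asyncio


async def send_with_retry(send, *, max_retries=3, sleep=asyncio.sleep):
    """Call `send()` up to `max_retries` times with 2**attempt backoff."""
    last_exc = None
    for attempt in range(max_retries):
        try:
            return await send()
        except Exception as exc:
            last_exc = exc
            # No sleep after the last attempt: there is nothing left to retry.
            if attempt < max_retries - 1:
                await sleep(2**attempt)
    if last_exc is None:
        # Only reachable when max_retries is 0 and the loop never ran.
        raise RuntimeError("send failed without an exception from any attempt")
    raise last_exc


async def demo():
    sleeps = []
    failures = {"left": 2}

    async def fake_sleep(delay):
        sleeps.append(delay)

    async def flaky_send():
        if failures["left"] > 0:
            failures["left"] -= 1
            raise ConnectionError("transient")
        return 101  # pretend message_id

    message_id = await send_with_retry(flaky_send, sleep=fake_sleep)
    return message_id, sleeps


result = asyncio.run(demo())
print(result)
```

With the default of three attempts, two failures produce delays of 1s and 2s before the third attempt succeeds. With zero attempts the loop body never runs, which is the `RuntimeError` case that `TestTelegramSendRetry` is described as covering.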
+
+---
+
+### Task 4: Non-final stream updates — edit in place with throttle/truncate/fallback
+
+**Files:**
+- Modify: `backend/app/channels/telegram.py` (`send`, new `_send_stream_update`)
+- Test: `backend/tests/test_channels.py` (`TestTelegramStreaming`)
+
+- [ ] **Step 1: Write the failing tests**
+
+Add to `TestTelegramStreaming`. First add a shared fake-bot factory at the top of the class:
+
+```python
+    @staticmethod
+    def _make_channel_with_bot():
+        from app.channels.telegram import TelegramChannel
+
+        bus = MessageBus()
+        ch = TelegramChannel(bus=bus, config={"bot_token": "test-token"})
+
+        mock_app = MagicMock()
+        bot = SimpleNamespace()
+        bot.sent = []
+        bot.edited = []
+        bot.next_message_id = 100
+
+        async def send_message(**kwargs):
+            bot.sent.append(kwargs)
+            result = MagicMock()
+            result.message_id = bot.next_message_id
+            bot.next_message_id += 1
+            return result
+
+        async def edit_message_text(**kwargs):
+            bot.edited.append(kwargs)
+            result = MagicMock()
+            result.message_id = kwargs["message_id"]
+            return result
+
+        bot.send_message = send_message
+        bot.edit_message_text = edit_message_text
+        mock_app.bot = bot
+        ch._application = mock_app
+        return ch, bot
+```
+
+Then the tests:
+
+```python
+    def test_stream_updates_edit_placeholder_in_place(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+            placeholder_id = ch._stream_messages["12345:42"]["message_id"]
+
+            update1 = OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="Hello", is_final=False, thread_ts="42")
+            await ch.send(update1)
+
+            clock["now"] += 2.0
+            update2 = OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="Hello world", is_final=False, thread_ts="42")
+            await ch.send(update2)
+
+            assert len(bot.sent) == 1  # only the placeholder
+            assert [e["message_id"] for e in bot.edited] == [placeholder_id, placeholder_id]
+            assert [e["text"] for e in bot.edited] == ["Hello", "Hello world"]
+
+        _run(go())
+
+    def test_stream_updates_throttled_within_interval(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="a", is_final=False, thread_ts="42"))
+            clock["now"] += 0.3  # within 1s window -> dropped
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="ab", is_final=False, thread_ts="42"))
+            clock["now"] += 1.0  # past window -> edited
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="abc", is_final=False, thread_ts="42"))
+
+            assert [e["text"] for e in bot.edited] == ["a", "abc"]
+
+        _run(go())
+
+    def test_stream_update_without_placeholder_sends_new_message(self):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="Hi", is_final=False, thread_ts="42"))
+
+            assert len(bot.sent) == 1
+            assert bot.sent[0]["text"] == "Hi"
+            assert ch._stream_messages["12345:42"]["message_id"] == 100
+
+        _run(go())
+
+    def test_stream_update_truncates_long_text(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+            long_text = "x" * 5000
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text=long_text, is_final=False, thread_ts="42"))
+
+            assert len(bot.edited) == 1
+            assert len(bot.edited[0]["text"]) == 4096
+            assert bot.edited[0]["text"].endswith("…")
+
+        _run(go())
+
+    def test_stream_update_retry_after_is_dropped(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+
+            async def edit_rate_limited(**kwargs):
+                exc = Exception("Flood control exceeded")
+                exc.retry_after = 5
+                raise exc
+
+            bot.edit_message_text = edit_rate_limited
+            # Must not raise, must not send a new message
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="Hi", is_final=False, thread_ts="42"))
+            assert len(bot.sent) == 1  # placeholder only
+
+        _run(go())
+```
+
+- [ ] **Step 2: Run tests to verify they fail**
+
+Run: `PYTHONPATH=. uv run pytest tests/test_channels.py::TestTelegramStreaming -v`
+Expected: the new tests FAIL (the current `send()` sends a new message for every outbound message, so nothing is edited and the `bot.sent` counts are wrong).
+
+- [ ] **Step 3: Implement**
+
+In `backend/app/channels/telegram.py`, replace the `send()` body and add `_send_stream_update`:
+
+```python
+    async def send(self, msg: OutboundMessage, *, _max_retries: int = 3) -> None:
+        if not self._application:
+            return
+
+        try:
+            chat_id = int(msg.chat_id)
+        except (ValueError, TypeError):
+            logger.error("Invalid Telegram chat_id: %s", msg.chat_id)
+            return
+
+        key = self._stream_key(msg.chat_id, msg.thread_ts)
+
+        if not msg.is_final:
+            await self._send_stream_update(chat_id, key, msg.text)
+            return
+
+        await self._send_new_message(chat_id, msg.chat_id, msg.text, _max_retries=_max_retries)
+
+    async def _send_stream_update(self, chat_id: int, key: str, text: str) -> None:
+        """Edit the in-flight streamed message with accumulated text.
+
+        Updates are best-effort: they are throttled, and rate-limited edits
+        are dropped silently. The manager always publishes a final message
+        afterwards, which guarantees delivery of the complete text.
+        """
+        if not text:
+            return
+
+        display = text
+        if len(display) > TELEGRAM_MAX_MESSAGE_LENGTH:
+            display = display[: TELEGRAM_MAX_MESSAGE_LENGTH - 1] + "…"
+
+        bot = self._application.bot
+        state = self._stream_messages.get(key)
+
+        if state is None:
+            try:
+                sent = await bot.send_message(chat_id=chat_id, text=display)
+            except Exception:
+                logger.exception("[Telegram] failed to start stream message in chat=%s", chat_id)
+                return
+            self._stream_messages[key] = {
+                "message_id": sent.message_id,
+                "last_edit_at": _monotonic(),
+                "last_text": display,
+            }
+            return
+
+        now = _monotonic()
+        if now - state["last_edit_at"] < STREAM_EDIT_MIN_INTERVAL_SECONDS:
+            return
+        if display == state["last_text"]:
+            return
+
+        try:
+            await bot.edit_message_text(chat_id=chat_id, message_id=state["message_id"], text=display)
+        except Exception as exc:
+            if self._is_not_modified(exc):
+                state["last_text"] = display
+                return
+            if self._is_retry_after(exc):
+                logger.debug("[Telegram] stream edit rate-limited in chat=%s, dropping update", chat_id)
+                return
+            logger.warning("[Telegram] stream edit failed in chat=%s, sending new message: %s", chat_id, exc)
+            try:
+                sent = await bot.send_message(chat_id=chat_id, text=display)
+            except Exception:
+                logger.exception("[Telegram] failed to send fallback stream message in chat=%s", chat_id)
+                return
+            state["message_id"] = sent.message_id
+
+        state["last_edit_at"] = _monotonic()
+        state["last_text"] = display
+```
+
+- [ ] **Step 4: Run tests to verify pass**
+
+Run: `PYTHONPATH=. uv run pytest tests/test_channels.py::TestTelegramStreaming tests/test_channels.py::TestTelegramSendRetry -v`
+Expected: all PASS. Note `TestTelegramSendRetry` still passes because its messages default to `is_final=True` with no registered stream state.
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add backend/app/channels/telegram.py backend/tests/test_channels.py
+git commit -m "feat(telegram): edit streamed message in place for non-final updates"
+```
+
+---
+
+### Task 5: Final message — last edit, >4096 split, cleanup
+
+**Files:**
+- Modify: `backend/app/channels/telegram.py` (`send`, new `_finalize_stream_message`)
+- Test: `backend/tests/test_channels.py` (`TestTelegramStreaming`)
+
+- [ ] **Step 1: Write the failing tests**
+
+Add to `TestTelegramStreaming`:
+
+```python
+    def test_final_message_edits_stream_message_and_clears_state(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+            placeholder_id = ch._stream_messages["12345:42"]["message_id"]
+
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="partial", is_final=False, thread_ts="42"))
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="full answer", is_final=True, thread_ts="42"))
+
+            assert [e["text"] for e in bot.edited] == ["partial", "full answer"]
+            assert len(bot.sent) == 1  # placeholder only — final edited, not re-sent
+            assert "12345:42" not in ch._stream_messages
+            assert ch._last_bot_message["12345"] == placeholder_id
+
+        _run(go())
+
+    def test_final_message_splits_long_text(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+            long_text = "a" * 4096 + "b" * 100
+
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text=long_text, is_final=True, thread_ts="42"))
+
+            assert len(bot.edited) == 1
+            assert bot.edited[0]["text"] == "a" * 4096
+            follow_ups = bot.sent[1:]  # bot.sent[0] is the placeholder
+            assert [m["text"] for m in follow_ups] == ["b" * 100]
+            # Fake bot assigns ids sequentially: placeholder=100, follow-up chunk=101
+            assert ch._last_bot_message["12345"] == 101
+            assert "12345:42" not in ch._stream_messages
+
+        _run(go())
+
+    def test_final_message_not_modified_error_is_ignored(self, monkeypatch):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            clock = {"now": 1000.0}
+            monkeypatch.setattr("app.channels.telegram._monotonic", lambda: clock["now"])
+
+            await ch._send_running_reply("12345", 42)
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="done", is_final=False, thread_ts="42"))
+
+            async def edit_not_modified(**kwargs):
+                raise Exception("Bad Request: message is not modified")
+
+            bot.edit_message_text = edit_not_modified
+            # Same text again as final — must not raise, must not send a new message
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="done", is_final=True, thread_ts="42"))
+
+            assert len(bot.sent) == 1  # placeholder only
+            assert "12345:42" not in ch._stream_messages
+
+        _run(go())
+
+    def test_final_without_stream_state_sends_plain_message(self):
+        async def go():
+            ch, bot = self._make_channel_with_bot()
+
+            await ch.send(OutboundMessage(channel_name="telegram", chat_id="12345", thread_id="t1", text="direct", is_final=True, thread_ts=None))
+
+            assert len(bot.sent) == 1
+            assert bot.sent[0]["text"] == "direct"
+            assert len(bot.edited) == 0
+
+        _run(go())
+```
+
+- [ ] **Step 2: Run tests to verify they fail**
+
+Run: `PYTHONPATH=. uv run pytest tests/test_channels.py::TestTelegramStreaming -v`
+Expected: new tests FAIL (final messages currently always go through `_send_new_message`).
+
+- [ ] **Step 3: Implement**
+
+In `backend/app/channels/telegram.py`, update `send()`'s final branch and add `_finalize_stream_message`:
+
+```python
+    async def send(self, msg: OutboundMessage, *, _max_retries: int = 3) -> None:
+        if not self._application:
+            return
+
+        try:
+            chat_id = int(msg.chat_id)
+        except (ValueError, TypeError):
+            logger.error("Invalid Telegram chat_id: %s", msg.chat_id)
+            return
+
+        key = self._stream_key(msg.chat_id, msg.thread_ts)
+
+        if not msg.is_final:
+            await self._send_stream_update(chat_id, key, msg.text)
+            return
+
+        state = self._stream_messages.pop(key, None)
+        if state is not None:
+            await self._finalize_stream_message(chat_id, msg.chat_id, state, msg.text)
+            return
+
+        await self._send_new_message(chat_id, msg.chat_id, msg.text, _max_retries=_max_retries)
+
+    async def _finalize_stream_message(self, chat_id: int, chat_key: str, state: dict[str, Any], text: str) -> None:
+        """Apply the final text: edit the streamed message, splitting overflow into follow-ups."""
+        bot = self._application.bot
+        chunks = self._split_message(text or "")
+        last_message_id = state["message_id"]
+
+        if chunks[0] != state["last_text"]:
+            try:
+                await bot.edit_message_text(chat_id=chat_id, message_id=state["message_id"], text=chunks[0])
+            except Exception as exc:
+                if self._is_not_modified(exc):
+                    pass
+                elif self._is_retry_after(exc):
+                    await asyncio.sleep(self._retry_after_seconds(exc))
+                    await bot.edit_message_text(chat_id=chat_id, message_id=state["message_id"], text=chunks[0])
+                else:
+                    logger.warning("[Telegram] final edit failed in chat=%s, sending new message: %s", chat_id, exc)
+                    sent = await bot.send_message(chat_id=chat_id, text=chunks[0])
+                    last_message_id = sent.message_id
+
+        for chunk in chunks[1:]:
+            sent = await bot.send_message(chat_id=chat_id, text=chunk)
+            last_message_id = sent.message_id
+
+        self._last_bot_message[chat_key] = last_message_id
+```
+
+- [ ] **Step 4: Run the full channel test file**
+
+Run: `PYTHONPATH=. uv run pytest tests/test_channels.py -v`
+Expected: all PASS (including Feishu/WeCom/manager tests — none of their code paths were touched).
+
+- [ ] **Step 5: Run telegram connection tests too**
+
+Run: `PYTHONPATH=. uv run pytest tests/test_telegram_channel_connections.py -v`
+Expected: all PASS.
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add backend/app/channels/telegram.py backend/tests/test_channels.py
+git commit -m "feat(telegram): finalize streamed message with overflow splitting"
+```
+
+---
+
+### Task 6: Documentation + full test suite
+
+**Files:**
+- Modify: `backend/CLAUDE.md` (IM Channels section)
+- Modify: `README.md` (only if it mentions Telegram non-streaming — check first)
+
+- [ ] **Step 1: Update backend/CLAUDE.md**
+
+In the "IM Channels System" section, three spots:
+
+1. The `manager.py` component bullet currently reads:
+
+> `manager.py` - Core dispatcher: creates threads via `client.threads.create()`, routes commands, keeps Slack/Telegram on `client.runs.wait()`, and uses `client.runs.stream(["messages-tuple", "values"])` for Feishu incremental outbound updates
+
+Change to:
+
+> `manager.py` - Core dispatcher: creates threads via `client.threads.create()`, routes commands, keeps Slack/Discord on `client.runs.wait()`, and uses `client.runs.stream(["messages-tuple", "values"])` for Feishu/Telegram incremental outbound updates
+
+2. The Message Flow items 5-6 currently read:
+
+> 5. Feishu chat: `runs.stream()` → accumulate AI text → publish multiple outbound updates (`is_final=False`) → publish final outbound (`is_final=True`)
+> 6. Slack/Telegram chat: `runs.wait()` → extract final response → publish outbound
+
+Change to:
+
+> 5. Feishu/Telegram chat: `runs.stream()` → accumulate AI text → publish multiple outbound updates (`is_final=False`) → publish final outbound (`is_final=True`)
+> 6. Slack/Discord chat: `runs.wait()` → extract final response → publish outbound
+
+3. Add a bullet after the Feishu card-patching item (item 7):
+
+> 8. Telegram streaming: the "Working on it..." placeholder message is registered as the stream target; non-final updates `editMessageText` it in place (1s channel-side throttle, 4096-char truncation, 429 updates dropped); the final update performs the last edit and splits >4096 texts into follow-up messages
+
+(Renumber the following items accordingly.)
+
+- [ ] **Step 2: Check README mentions**
+
+Run: `grep -rl "Telegram" README.md docs/ --include="*.md" | head`
+If any doc states Telegram does not stream, update it the same way. If none, skip.
+
+- [ ] **Step 3: Run the full backend test suite**
+
+Run from `backend/`: `make test`
+Expected: all PASS.
+
+- [ ] **Step 4: Lint**
+
+Run from `backend/`: `make lint`
+Expected: clean.
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add backend/CLAUDE.md README.md docs/
+git commit -m "docs: telegram channel now streams replies via message editing"
+```
+
+---
+
+## Self-Review Notes
+
+- **Spec coverage:** capability flip (Task 1), placeholder reuse (Task 2), throttle/truncate/429-drop/fallback-new-message (Task 4), final edit/split/cleanup/not-modified/RetryAfter-wait (Task 5), direct-send regression protection (Task 5 `test_final_without_stream_state_sends_plain_message` + existing `TestTelegramSendRetry`), docs (Task 6). Spec test list items 1-6 all map to concrete tests.
+- **Type consistency:** `_stream_messages: dict[str, dict[str, Any]]` keys `message_id`/`last_edit_at`/`last_text` used identically in Tasks 2, 4, 5. `_send_new_message(chat_id: int, chat_key: str, text: str)` signature consistent between Tasks 3 and 5.
+- **Known trade-off:** the final-path fallback `send_message` in `_finalize_stream_message` has no retry loop (single attempt, exception propagates to `_on_outbound` which logs and skips file uploads — same contract as today's `send()` failure).
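If the single-attempt trade-off noted above ever needs revisiting, a bounded retry that waits out rate limits could look roughly like the sketch below. This is an illustration only, not something the plan implements: `FakeRetryAfter`, `send_with_retry`, and `demo` are hypothetical names, and the exception class merely stands in for an error that carries a `retry_after` hint.

```python
import asyncio


class FakeRetryAfter(Exception):
    """Hypothetical stand-in for a 429 error that carries a retry_after hint."""

    def __init__(self, retry_after: float) -> None:
        super().__init__(f"retry after {retry_after}s")
        self.retry_after = retry_after


async def send_with_retry(send, *, max_attempts: int = 3, sleep=asyncio.sleep):
    """Await the zero-argument coroutine function `send`, waiting out rate limits.

    Re-raises the rate-limit error once `max_attempts` is exhausted; any other
    exception propagates immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await send()
        except FakeRetryAfter as exc:
            if attempt == max_attempts:
                raise
            await sleep(exc.retry_after)


def demo():
    """Run a send that is rate-limited twice and then succeeds."""
    calls = {"n": 0}
    waits = []

    async def flaky_send():
        calls["n"] += 1
        if calls["n"] < 3:
            raise FakeRetryAfter(0.5)
        return 101  # pretend message id

    async def fake_sleep(seconds):
        # Record the wait instead of actually sleeping.
        waits.append(seconds)

    result = asyncio.run(send_with_retry(flaky_send, sleep=fake_sleep))
    return result, calls["n"], waits
```

Injecting `sleep` keeps the helper testable without real delays; the cap on attempts keeps a persistently rate-limited chat from blocking the outbound loop indefinitely.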
@@ -0,0 +1,79 @@
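+# Telegram Streaming Output Design
+
+Date: 2026-06-12
+Branch: `feat/telegram-streaming`
+Status: confirmed with the user
+
+## Background and Goals
+
+The Telegram channel currently does not stream at all: `ChannelManager._handle_chat()` takes the blocking `client.runs.wait()` path and sends the final text in a single `send_message` once the agent has finished. The user first sees "Working on it..." and then gets no feedback for a long time.
+
+Goal: make Telegram behave like Feishu. Stream all AI text increments (the accumulated text produced by the manager's existing streaming pipeline) by editing a single message in place, and finish with the complete result marked `is_final=True`.
+
+## Options Considered
+
+- **Option A (adopted)**: adapt on the channel side. Change only `telegram.py` plus one line in `CHANNEL_CAPABILITIES`; the Telegram channel does its own edit throttling and rate-limit tolerance. The manager streaming code path shared by Feishu/WeChat/DingTalk is left untouched.
+- Option B (rejected): have the manager support a per-channel `stream_min_interval` throttle. The semantics would be more uniform, but it changes the shared path and widens the regression surface.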
+
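+## Change 1: `backend/app/channels/manager.py`
+
+`CHANNEL_CAPABILITIES["telegram"]["supports_streaming"]` changes from `False` to `True`.
+
+Once this takes effect, the manager automatically goes through `_handle_streaming_chat()`:
+- it keeps publishing `OutboundMessage`s with `is_final=False` to the bus (full accumulated text, throttled at 0.35s at the manager level);
+- when the stream ends (or fails), it always publishes one complete result with `is_final=True` (including artifacts/attachments).
+
+No other manager changes.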
+
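+## Change 2: `backend/app/channels/telegram.py`
+
+### Stream state
+
+- Add `self._stream_messages: dict[str, dict]`, keyed by `f"{chat_id}:{thread_ts}"` (`thread_ts` is the id of the user message that triggered this turn, passed through unchanged on both inbound and outbound).
+- Each value records `message_id` (the bot message being edited), `last_edit_at` (throttle timestamp), and `last_text` (the text already rendered, used to skip no-op edits).
+
+### Placeholder reuse
+
+The "Working on it..." message sent by `_send_running_reply()` has its `message_id` recorded and registered in `_stream_messages`. The first streaming update edits that placeholder directly.
+
+### `send()` branches on `is_final`
+
+**`is_final=False` (streaming update):**
+1. Throttle: if less than 1.0 second has passed since the last successful edit for the same key (3.0 seconds in group chats, where `chat_id` is negative, because Telegram groups are capped at 20 messages/minute), drop this update. This is safe: every update carries the full text, and the final message is always delivered as a backstop.
+2. Text equal to `last_text`: skip.
+3. Stream message registered: `edit_message_text`. Not registered (for example, the placeholder failed to send): create one with `send_message` and register it.
+4. Text longer than 4096 characters: truncate to 4095 characters and append `…` before editing.
+
+**`is_final=True` (final result):**
+1. Text ≤ 4096: perform one last `edit_message_text` on the registered stream message.
+2. Text > 4096: the first chunk (up to 4096) edits the stream message; the rest is sent in 4096-character chunks via `send_message`.
+3. Clear the `_stream_messages` state for this key and update `_last_bot_message[chat_id]` with the id of the last message (preserving the existing threaded-reply behavior).
+4. With no registered stream message, fall back to the current `send_message` logic (including the existing 3 retries). Note: command replies and `_send_error` error replies carry a matching `thread_ts` and the placeholder is already registered, so they also take the "edit the placeholder" path (an intentional UX improvement) rather than sending a new message.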
+
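+### Error handling
+
+- `telegram.error.RetryAfter` (429): drop the streaming update, with no retry and no wait (the next update carries the full text anyway). On the final path, a 429 waits for `retry_after` and then retries, so the final result is delivered.
+- `BadRequest: message is not modified`: silently ignored (bound to occur when the final text is identical to the last streamed frame).
+- Other edit failures (for example, the user deleted the message): fall back to `send_message` to send a new message and update the registration.
+
+### Unchanged
+
+- Plain-text sending; no `parse_mode` is introduced (so no risk of Markdown parse failures).
+- The `send_file()` attachment flow is untouched; attachments arrive only with the final message, so ordering is unchanged.
+- Non-streaming direct sends (`is_final=True` with no registered state) behave exactly as they do today.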
+
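+## Tests
+
+Add Telegram streaming cases (following the fake-bot pattern of the Feishu streaming cases in `tests/test_channels.py`):
+
+1. Multiple `is_final=False` updates: the first edits the placeholder, and later ones keep editing the same `message_id`.
+2. Rapid updates within 1 second are dropped by the throttle; the final still arrives in full.
+3. Final over 4096: first chunk edited, remaining chunks sent as follow-ups, and `_last_bot_message` points at the last chunk.
+4. `message is not modified` is silently ignored and not counted as a failure.
+5. When the placeholder is missing, the first streaming update degrades to creating a message with `send_message`.
+6. The direct-send path for `is_final=True` without stream state is unchanged (regression protection).
+
+## Risks
+
+- Telegram rate-limits edits per chat fairly strictly (roughly 1 per second). The 1s channel-side throttle plus the drop-on-429 policy is the Telegram equivalent of Feishu's 0.35s interval; in the worst case intermediate frames are lost, and final completeness is guaranteed by `is_final=True`.
+- Concurrent topics in a group chat: the key includes `thread_ts`, so streams for different topics do not interfere with each other.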
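The throttle, skip, and truncation rules described in this spec can be sketched as a pure decision function. This is a minimal illustration under stated assumptions, not the channel's actual code: `PRIVATE_INTERVAL_SECONDS`, `GROUP_INTERVAL_SECONDS`, `edit_interval`, `truncate_for_display`, and `should_edit` are hypothetical names, and the 1.0s / 3.0s / 4096 values are taken from the spec text.

```python
TELEGRAM_MAX_MESSAGE_LENGTH = 4096  # per-message text limit, as stated in the spec
PRIVATE_INTERVAL_SECONDS = 1.0      # hypothetical name; value from the spec
GROUP_INTERVAL_SECONDS = 3.0        # hypothetical name; value from the spec


def edit_interval(chat_id: int) -> float:
    """Pick the minimum gap between edits; group chats have negative ids."""
    return GROUP_INTERVAL_SECONDS if chat_id < 0 else PRIVATE_INTERVAL_SECONDS


def truncate_for_display(text: str) -> str:
    """Cap a non-final frame at the message limit, ending with an ellipsis."""
    if len(text) <= TELEGRAM_MAX_MESSAGE_LENGTH:
        return text
    return text[: TELEGRAM_MAX_MESSAGE_LENGTH - 1] + "…"


def should_edit(chat_id: int, now: float, last_edit_at: float, last_text: str, text: str) -> bool:
    """Return True when a non-final update should be sent as an edit."""
    if not text:
        return False  # nothing to render
    if now - last_edit_at < edit_interval(chat_id):
        return False  # throttled: a later frame carries the full text anyway
    # Skip edits that would not change what is already rendered.
    return truncate_for_display(text) != last_text
```

Keeping the decision separate from the network call makes the timing rules testable with a fake clock, which is the same idea the plan's tests use when they monkeypatch `_monotonic`.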