feat(telegram): stream agent replies by editing the placeholder message in place (#3534)

* docs(spec): telegram streaming output design

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(plan): telegram streaming implementation plan

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(telegram): report streaming support for telegram channel

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(channels): use slack as the non-streaming sample channel in manager tests

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(telegram): register running-reply placeholder as stream target

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(telegram): pin last_edit_at sentinel in placeholder registration test

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* refactor(telegram): extract _send_new_message from send()

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(telegram): edit streamed message in place for non-final updates

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(telegram): finalize streamed message with overflow splitting

When is_final=True arrives and stream state exists, pop the state, edit
the streamed placeholder with the final text, split overflow into follow-up
send_message calls, update _last_bot_message, and clear stream state.
Falls back to _send_new_message when no stream state is registered.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(telegram): exercise the not-modified handler in final edit path

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs: telegram channel now streams replies via message editing

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(telegram): harden final-delivery path with guarded retry and chunk retries

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(channels): accept runtime 'messages' SSE event for streaming text accumulation

The embedded runtime (matching LangGraph Platform semantics) emits SSE
event name 'messages' for the requested 'messages-tuple' stream mode,
so the manager never accumulated token deltas and streaming channels
only updated from end-of-step 'values' snapshots — on Telegram this
looked like 'Working on it...' followed by the full answer in one block.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(telegram): widen stream-edit throttle to 3s in group chats

Telegram caps bots at 20 messages/minute per group, stricter than the
1 msg/s per-chat guideline. Groups have negative chat ids, so pick the
interval by sign.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(telegram): address review findings — thread fallback messages, bound stream registry, share stream-event constants

- Fallback/new stream messages now carry reply_to_message_id parsed from
  thread_ts so they stay nested under the user's message (finding 1)
- STREAM_MODES / MESSAGE_STREAM_EVENTS constants link the requested
  stream modes to the SSE event names they arrive under (finding 2)
- _register_stream_message bounds the in-flight registry at 256 entries,
  evicting oldest, guarding against leaks when a final never arrives (finding 4)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
This commit is contained in:
DanielWalnut
2026-06-13 08:38:28 +08:00
committed by GitHub
parent 3475f7cdad
commit 839fa99237
6 changed files with 1557 additions and 23 deletions
+179 -6
View File
@@ -5,6 +5,7 @@ from __future__ import annotations
import asyncio
import logging
import threading
import time
from typing import Any
from app.channels.base import Channel
@@ -13,6 +14,18 @@ from app.channels.message_bus import InboundMessage, InboundMessageType, Message
logger = logging.getLogger(__name__)
TELEGRAM_MAX_MESSAGE_LENGTH = 4096
STREAM_EDIT_MIN_INTERVAL_SECONDS = 1.0
# Groups (negative chat_id) are capped at 20 messages/minute by Telegram,
# so stream edits there must pace well below the private-chat 1 msg/s guideline.
STREAM_EDIT_GROUP_MIN_INTERVAL_SECONDS = 3.0
# Bound on tracked in-flight streamed messages; entries normally clear on the
# final update, this only guards against leaks when a final never arrives.
MAX_TRACKED_STREAM_MESSAGES = 256
# Indirection so tests can patch the clock without touching the global time module.
_monotonic = time.monotonic
class TelegramChannel(Channel):
"""Telegram bot channel using long-polling.
@@ -36,8 +49,15 @@ class TelegramChannel(Channel):
pass
# chat_id -> last sent message_id for threaded replies
self._last_bot_message: dict[str, int] = {}
# stream_key ("chat_id:thread_ts") -> state of the in-flight streamed
# bot message being edited in place: {"message_id", "last_edit_at", "last_text"}
self._stream_messages: dict[str, dict[str, Any]] = {}
self._connection_repo = config.get("connection_repo")
@property
def supports_streaming(self) -> bool:
return True
async def start(self) -> None:
if self._running:
return
@@ -104,10 +124,117 @@ class TelegramChannel(Channel):
logger.error("Invalid Telegram chat_id: %s", msg.chat_id)
return
kwargs: dict[str, Any] = {"chat_id": chat_id, "text": msg.text}
key = self._stream_key(msg.chat_id, msg.thread_ts)
if not msg.is_final:
await self._send_stream_update(chat_id, key, msg.text, reply_to=self._parse_message_id(msg.thread_ts))
return
state = self._stream_messages.pop(key, None)
if state is not None:
await self._finalize_stream_message(chat_id, msg.chat_id, state, msg.text)
return
await self._send_new_message(chat_id, msg.chat_id, msg.text, _max_retries=_max_retries)
async def _send_stream_update(self, chat_id: int, key: str, text: str, reply_to: int | None = None) -> None:
"""Edit the in-flight streamed message with accumulated text.
Updates are best-effort: throttled, rate-limit drops are silent. The
manager always publishes a final message afterwards, which guarantees
delivery of the complete text.
"""
if not text:
return
display = text
if len(display) > TELEGRAM_MAX_MESSAGE_LENGTH:
display = display[: TELEGRAM_MAX_MESSAGE_LENGTH - 1] + ""
bot = self._application.bot
state = self._stream_messages.get(key)
send_kwargs: dict[str, Any] = {"chat_id": chat_id, "text": display}
if reply_to:
send_kwargs["reply_to_message_id"] = reply_to
if state is None:
try:
sent = await bot.send_message(**send_kwargs)
except Exception:
logger.exception("[Telegram] failed to start stream message in chat=%s", chat_id)
return
self._register_stream_message(key, message_id=sent.message_id, last_text=display, last_edit_at=_monotonic())
return
now = _monotonic()
min_interval = STREAM_EDIT_GROUP_MIN_INTERVAL_SECONDS if chat_id < 0 else STREAM_EDIT_MIN_INTERVAL_SECONDS
if now - state["last_edit_at"] < min_interval:
return
if display == state["last_text"]:
return
try:
await bot.edit_message_text(chat_id=chat_id, message_id=state["message_id"], text=display)
except Exception as exc:
if self._is_not_modified(exc):
state["last_text"] = display
return
if self._is_retry_after(exc):
logger.debug("[Telegram] stream edit rate-limited in chat=%s, dropping update", chat_id)
return
logger.warning("[Telegram] stream edit failed in chat=%s, sending new message: %s", chat_id, exc)
try:
sent = await bot.send_message(**send_kwargs)
except Exception:
logger.exception("[Telegram] failed to send fallback stream message in chat=%s", chat_id)
return
state["message_id"] = sent.message_id
state["last_edit_at"] = _monotonic()
state["last_text"] = display
async def _finalize_stream_message(self, chat_id: int, chat_key: str, state: dict[str, Any], text: str) -> None:
"""Apply the final text: edit the streamed message, splitting overflow into follow-ups."""
bot = self._application.bot
chunks = self._split_message(text or "")
edited = True
if chunks[0] != state["last_text"]:
edited = await self._edit_final_chunk(bot, chat_id, state["message_id"], chunks[0])
if edited:
self._last_bot_message[chat_key] = state["message_id"]
else:
# Edit could not be applied (e.g. message deleted) — deliver the
# first chunk as a fresh message with the standard retry policy.
await self._send_new_message(chat_id, chat_key, chunks[0])
for chunk in chunks[1:]:
await self._send_new_message(chat_id, chat_key, chunk)
async def _edit_final_chunk(self, bot, chat_id: int, message_id: int, text: str) -> bool:
"""Edit with one rate-limit retry. Returns False if the edit could not be applied."""
for attempt in range(2):
try:
await bot.edit_message_text(chat_id=chat_id, message_id=message_id, text=text)
return True
except Exception as exc:
if self._is_not_modified(exc):
return True
if self._is_retry_after(exc) and attempt == 0:
await asyncio.sleep(self._retry_after_seconds(exc))
continue
logger.warning("[Telegram] final edit failed in chat=%s: %s", chat_id, exc)
return False
return False
async def _send_new_message(self, chat_id: int, chat_key: str, text: str, *, _max_retries: int = 3) -> int | None:
"""Send a fresh message with retry/backoff. Returns the sent message_id."""
kwargs: dict[str, Any] = {"chat_id": chat_id, "text": text}
# Reply to the last bot message in this chat for threading
reply_to = self._last_bot_message.get(msg.chat_id)
reply_to = self._last_bot_message.get(chat_key)
if reply_to:
kwargs["reply_to_message_id"] = reply_to
@@ -116,8 +243,8 @@ class TelegramChannel(Channel):
for attempt in range(_max_retries):
try:
sent = await bot.send_message(**kwargs)
self._last_bot_message[msg.chat_id] = sent.message_id
return
self._last_bot_message[chat_key] = sent.message_id
return sent.message_id
except Exception as exc:
last_exc = exc
if attempt < _max_retries - 1:
@@ -180,17 +307,63 @@ class TelegramChannel(Channel):
# -- helpers -----------------------------------------------------------
@staticmethod
def _stream_key(chat_id: str, thread_ts: str | None) -> str:
return f"{chat_id}:{thread_ts or ''}"
@staticmethod
def _parse_message_id(value: str | None) -> int | None:
try:
return int(value) if value else None
except (TypeError, ValueError):
return None
def _register_stream_message(self, key: str, *, message_id: int, last_text: str, last_edit_at: float) -> None:
self._stream_messages.pop(key, None)
while len(self._stream_messages) >= MAX_TRACKED_STREAM_MESSAGES:
self._stream_messages.pop(next(iter(self._stream_messages)))
self._stream_messages[key] = {
"message_id": message_id,
"last_edit_at": last_edit_at,
"last_text": last_text,
}
@staticmethod
def _is_retry_after(exc: Exception) -> bool:
return getattr(exc, "retry_after", None) is not None
@staticmethod
def _retry_after_seconds(exc: Exception) -> float:
value = getattr(exc, "retry_after", 0)
if hasattr(value, "total_seconds"):
return float(value.total_seconds())
return float(value)
@staticmethod
def _is_not_modified(exc: Exception) -> bool:
return "message is not modified" in str(exc).lower()
@staticmethod
def _split_message(text: str) -> list[str]:
return [text[i : i + TELEGRAM_MAX_MESSAGE_LENGTH] for i in range(0, len(text), TELEGRAM_MAX_MESSAGE_LENGTH)] or [text]
async def _send_running_reply(self, chat_id: str, reply_to_message_id: int) -> None:
"""Send a 'Working on it...' reply to the user's message."""
"""Send a 'Working on it...' reply and register it as the stream target."""
if not self._application:
return
try:
bot = self._application.bot
await bot.send_message(
sent = await bot.send_message(
chat_id=int(chat_id),
text="Working on it...",
reply_to_message_id=reply_to_message_id,
)
self._register_stream_message(
self._stream_key(chat_id, str(reply_to_message_id)),
message_id=sent.message_id,
last_text="Working on it...",
last_edit_at=0.0,
)
logger.info("[Telegram] 'Working on it...' reply sent in chat=%s", chat_id)
except Exception:
logger.exception("[Telegram] failed to send running reply in chat=%s", chat_id)