deer-flow/scripts/detect_uv_extras.py
Yi Tang 48e038f752 feat(channels): enhance Discord with mention-only mode, thread routing, and typing indicators (#2842)
* feat(channels): enhance Discord with mention-only mode, thread routing, and typing indicators

Add mention_only config to only respond when bot is mentioned, with
allowed_channels override. Add thread_mode for Hermes-style auto-thread
creation. Add periodic typing indicators while bot is processing.

* fix(discord): include allowed_channels in mention_only skip condition (line 274)

* docs: fix Discord config example to match boolean thread_mode implementation

* style: format with ruff

* fix(discord): apply Copilot review fixes and resolve lint errors

- Remove unused Optional import
- Fix thread_ts type hints to str | None
- Fix has_mention logic for None values
- Implement thread_mode fallback to channel replies on thread creation failure
- Fix thread_mode docstring alignment
- Fix allowed_channels comment formatting in config.example.yaml

* fix(discord): reset context for orphaned threads in mention_only mode

When a message arrives in a thread not tracked by _active_threads,
clear thread_id and typing_target so the message falls through to
the standard channel handling pipeline, which creates a fresh thread
instead of incorrectly routing to the stale thread.

* fix(discord): create new thread on @ when channel has existing tracked thread

When mention_only is enabled and a user @-s the bot in a channel
that already has a tracked thread, create a new thread instead of
incorrectly routing to the old one.

* fix(discord): allow no-@ thread replies while skipping no-@ channel messages

The skip block for no-@ messages was too aggressive — it blocked
continuation replies within tracked threads AND incorrectly routed
no-@ channel messages to the existing thread.

Now:
- Thread message, no @ → routed to existing tracked thread
- Channel message, no @ → skipped
- Channel message, with @ → creates new thread
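
The three cases above can be read as a small decision function. The following is only a sketch of that routing table, not the code in the Discord channel: `Route`, `route_message`, and both parameter names are hypothetical, and the treatment of an @-mention inside a tracked thread (routed to the same thread) is an assumption, since the list does not cover it.

```python
from enum import Enum


class Route(Enum):
    """Possible outcomes for an incoming message (hypothetical names)."""

    EXISTING_THREAD = "existing_thread"
    SKIP = "skip"
    NEW_THREAD = "new_thread"


def route_message(in_tracked_thread: bool, has_mention: bool) -> Route:
    """Decide how to handle a message when mention_only is enabled."""
    if in_tracked_thread:
        # Continuation replies inside a tracked thread need no @.
        # Assumption: an @ inside a tracked thread stays in that thread too.
        return Route.EXISTING_THREAD
    if not has_mention:
        # Plain channel chatter is ignored.
        return Route.SKIP
    # An @ in the channel always starts a fresh thread.
    return Route.NEW_THREAD
```

Checking the thread case first is what keeps the skip rule from swallowing continuation replies.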

* feat(discord): add checkmark reaction to acknowledge received messages

* Move discord.py to optional dependency and auto-detect from config.yaml

- Add discord extra to [project.optional-dependencies] in pyproject.toml
- Update detect_uv_extras.py to map channels.discord.enabled: true -> --extra discord
- Set UV_EXTRAS=discord in docker-compose-dev.yaml gateway env

* fix(discord): persist thread-channel mappings to store for recovery after restart

Discord's _active_threads dict was purely in-memory, so all channel-to-thread
mappings were lost on server restart. This fix bridges ChannelStore into
DiscordChannel:

- Save thread mappings to store.json after every thread creation
- Restore active threads from store on DiscordChannel startup
- Pass channel_store to all channels via service.py config injection

Store keys follow the pattern: discord:<channel_id>:<thread_id>

* fix(discord): address Copilot review — fix types, typing targets, cross-thread safety, and config comments

* fix(tests): add multitask_strategy param to mock for clarification follow-up test

* fix(tests): explicitly set model_name=None for title middleware test isolation

* fix(discord): use trigger_typing() instead of typing() for typing indicators

discord.py 2.x TextChannel.typing() and Thread.typing() are async context
managers, not one-shot coroutines. Use trigger_typing() for periodic
typing indicator pings.

* fix(discord): cancel typing tasks on channel shutdown

Prevents 'Task was destroyed but it is pending' warnings when the
Discord client stops while typing indicator loops are still running.

* fix(scripts): detect nested YAML config for discord extra

section_value() only matched top-level YAML sections. Added
nested_section_value() that handles two-level nesting (e.g.,
channels.discord.enabled), so auto-detection of the discord
extra works when config uses the standard nested format.
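
For illustration only, a dotted-path lookup over simple block-style YAML can be sketched with an indentation stack. This is not the script's nested_section_value(); `lookup` is a hypothetical helper, it strips comments naively (a `#` inside a quoted value would be cut), and it ignores lists and multi-line scalars.

```python
from __future__ import annotations

import re

# key, optional value, and the leading whitespace that defines nesting
_LINE_RE = re.compile(r"^(\s*)([A-Za-z_][\w-]*)\s*:\s*(.*?)\s*$")


def lookup(text: str, dotted_path: str) -> str | None:
    """Return the scalar stored at dotted_path, or None if it is absent."""
    wanted = dotted_path.split(".")
    stack: list[tuple[int, str]] = []  # (indent, key) of the enclosing mappings
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # naive comment strip (simplification)
        match = _LINE_RE.match(line)
        if not match:
            continue
        indent, key, value = len(match.group(1)), match.group(2), match.group(3)
        # Drop every mapping this line is not nested inside.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if value:
            if [k for _, k in stack] + [key] == wanted:
                return value.strip("'\"")
        else:
            # A key with no value opens a nested mapping.
            stack.append((indent, key))
    return None


SAMPLE = """\
channels:
  slack:
    enabled: false
  discord:
    enabled: true  # turn the bot on
database:
  backend: postgres
"""

print(lookup(SAMPLE, "channels.discord.enabled"))
```

The same helper resolves `database.backend`, which shows why a single path-based lookup could in principle serve both the one-level and two-level cases.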

* fix(docker): remove hard-coded UV_EXTRAS=discord from dev compose

Relies on auto-detection via detect_uv_extras.py instead of forcing
discord.py install even when channels.discord.enabled is false.
Matches production docker-compose.yaml behavior (UV_EXTRAS:-).

* refactor(nginx): move proxy_buffering/proxy_cache to server level

DRY cleanup — these directives were repeated in 14 location blocks.
Set at server level once, reducing duplication and risk of drift.

* fix(discord): use dedicated JSON file for thread persistence

Replace ChannelStore usage for Discord thread-ID persistence with a
dedicated discord_threads.json file. ChannelStore is designed to map
IM conversations to DeerFlow thread IDs — using it to persist Discord
thread IDs was semantically wrong and confusing.

Changes:
- _save_thread() now reads/writes a simple {channel_id: thread_id} JSON dict
- _load_active_threads() reads directly from the JSON file
- File path derived from ChannelStore directory (when available) or
  defaults to ~/.deer-flow/channels/discord_threads.json
- Removed unused ChannelStore import
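
A minimal sketch of the {channel_id: thread_id} file described above, assuming string IDs and a caller-supplied path. `load_threads` and `save_thread` are hypothetical stand-ins for _load_active_threads() and _save_thread(), whose real signatures are not shown here.

```python
from __future__ import annotations

import json
from pathlib import Path


def load_threads(path: Path) -> dict[str, str]:
    """Read the mapping; a missing or corrupt file yields an empty dict."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # ValueError covers json.JSONDecodeError and bad UTF-8.
        return {}
    return data if isinstance(data, dict) else {}


def save_thread(path: Path, channel_id: str, thread_id: str) -> None:
    """Record one channel -> thread mapping, keeping the existing entries."""
    data = load_threads(path)
    data[channel_id] = thread_id
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
```

Treating an unreadable file as empty means a damaged store costs only the saved mappings, not the startup of the channel.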

* fix(discord): address WillemJiang's code review comments on PR #2842

1. Remove semantically incorrect message_in_thread variable. At this code
   point (after the Thread case is handled above), we're guaranteed to be in
   a channel, not a thread. Always apply mention_only check here.

2. Add _active_thread_ids reverse-lookup set for O(1) thread ID membership
   checks instead of O(n) scan of _active_threads.values(). Keep the set
   in sync with _active_threads in _load_active_threads() and _save_thread().

3. Add _thread_store_lock (threading.Lock) to protect _active_threads and
   the JSON file from concurrent access between the Discord loop thread
   (_run_client) and the main thread (_load_active_threads, _save_thread).
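
Points 2 and 3 can be pictured together as a mapping, a reverse-lookup set, and one lock guarding both. The class below is a sketch under assumptions: `ActiveThreads` and its method names are hypothetical, and dropping a replaced thread ID from the set is an illustrative choice, not confirmed behavior of the channel.

```python
from __future__ import annotations

import threading


class ActiveThreads:
    """Channel -> thread mapping with an O(1) reverse membership check."""

    def __init__(self) -> None:
        self._by_channel: dict[str, str] = {}
        self._thread_ids: set[str] = set()
        self._lock = threading.Lock()

    def set(self, channel_id: str, thread_id: str) -> None:
        with self._lock:
            old = self._by_channel.get(channel_id)
            self._by_channel[channel_id] = thread_id
            # Assumption: a replaced thread stops being tracked unless
            # another channel still points at it.
            if old is not None and old not in self._by_channel.values():
                self._thread_ids.discard(old)
            self._thread_ids.add(thread_id)

    def get(self, channel_id: str) -> str | None:
        with self._lock:
            return self._by_channel.get(channel_id)

    def is_tracked(self, thread_id: str) -> bool:
        # Set membership avoids scanning every value of the mapping.
        with self._lock:
            return thread_id in self._thread_ids
```

Updating the dict and the set under the same lock is what keeps the two views consistent when the Discord loop thread and the main thread both touch them.
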
2026-05-15 22:30:05 +08:00

262 lines
8.2 KiB
Python
Executable File

#!/usr/bin/env python3
"""Resolve uv extras for local `uv sync` based on environment + config.yaml.

Order of resolution:

1. `UV_EXTRAS` env var. Comma- or whitespace-separated names so multiple
   extras can be layered (e.g. ``UV_EXTRAS=postgres,ollama``). The same
   parsing semantics apply in the Docker dev container via
   ``docker/dev-entrypoint.sh``. The Docker image-build path
   (``backend/Dockerfile``) still treats `UV_EXTRAS` as a single token, so
   ``UV_EXTRAS=postgres,ollama`` would only install ``postgres,ollama`` as
   one (invalid) extra at build time; author build-time values as a
   single name.
2. Auto-detection from config.yaml, which currently maps:
   - database.backend == postgres       -> postgres
   - checkpointer.type == postgres      -> postgres
   - channels.discord.enabled == true   -> discord

Each extra name is validated against ``^[A-Za-z][A-Za-z0-9_-]*$`` (the same
shape uv enforces for `[project.optional-dependencies]` keys). Anything else
is dropped with a stderr warning so a stray shell metacharacter in `.env`
cannot reach the `uv sync` invocation downstream.

Output: space-separated `--extra <name>` flags ready for splat into
`uv sync`, e.g. `--extra postgres`. Empty output means "no extras".

Intentionally implemented with the standard library only: this script must run
*before* `uv sync` has populated the venv, so it cannot depend on PyYAML.
"""

from __future__ import annotations

import os
import re
import sys
from pathlib import Path

# Mirrors uv's accepted shape for extra names; keeps the eventual
# `uv sync --extra <name>` invocation free of shell metacharacters even when
# `UV_EXTRAS` comes from `.env` or another semi-trusted source.
_EXTRA_NAME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_-]*$")


def _validate_extras(names: list[str]) -> list[str]:
    valid: list[str] = []
    for name in names:
        if _EXTRA_NAME_RE.match(name):
            valid.append(name)
        else:
            print(
                f"detect_uv_extras: ignoring invalid UV_EXTRAS entry {name!r} (must match [A-Za-z][A-Za-z0-9_-]*)",
                file=sys.stderr,
            )
    return valid


def parse_env_extras(value: str) -> list[str]:
    """Split UV_EXTRAS into a list, accepting comma or whitespace separators."""
    parts = re.split(r"[\s,]+", value.strip())
    return _validate_extras([p for p in parts if p])


def find_config_file() -> Path | None:
    """Locate config.yaml using the same precedence as serve.sh."""
    explicit = os.environ.get("DEER_FLOW_CONFIG_PATH")
    if explicit:
        candidate = Path(explicit)
        if candidate.is_file():
            return candidate
    for path in (Path("config.yaml"), Path("backend/config.yaml")):
        if path.is_file():
            return path
    return None


_SECTION_RE = re.compile(r"^([A-Za-z_][\w-]*)\s*:\s*$")
_INDENTED_SECTION_RE = re.compile(r"^\s+([A-Za-z_][\w-]*)\s*:\s*$")
_KEY_RE = re.compile(r"^\s+([A-Za-z_][\w-]*)\s*:\s*(\S.*?)\s*$")


def _strip_comment(line: str) -> str:
    """Drop trailing `#` comments while preserving `#` inside quoted strings."""
    in_quote: str | None = None
    out: list[str] = []
    for ch in line:
        if in_quote is not None:
            out.append(ch)
            if ch == in_quote:
                in_quote = None
            continue
        if ch in ("'", '"'):
            in_quote = ch
            out.append(ch)
        elif ch == "#":
            break
        else:
            out.append(ch)
    return "".join(out).rstrip()


def _unquote(value: str) -> str:
    if len(value) >= 2 and value[0] == value[-1] and value[0] in ("'", '"'):
        return value[1:-1]
    return value


def section_value(lines: list[str], section: str, key: str) -> str | None:
    """Return the value of `section.key` from a flat-ish YAML, or None.

    Only handles the shallow shape DeerFlow uses for these settings:

        database:
          backend: postgres

    Nested mappings deeper than the immediate child level are ignored on
    purpose; that keeps this parser predictable without a full YAML stack.
    """
    inside = False
    child_indent: int | None = None
    for raw in lines:
        line = _strip_comment(raw)
        if not line.strip():
            continue
        sect_match = _SECTION_RE.match(line)
        if sect_match:
            inside = sect_match.group(1) == section
            child_indent = None
            continue
        if not inside:
            continue
        stripped = line.lstrip()
        indent = len(line) - len(stripped)
        if indent == 0:
            inside = False
            continue
        if child_indent is None:
            child_indent = indent
        if indent < child_indent:
            inside = False
            continue
        if indent != child_indent:
            continue
        key_match = _KEY_RE.match(line)
        if key_match and key_match.group(1) == key:
            return _unquote(key_match.group(2).strip())
    return None


def nested_section_value(lines: list[str], section_path: str, key: str) -> str | None:
    """Return the value of a nested YAML key like ``channels.discord.enabled``.

    Handles two levels of nesting:

        channels:
          discord:
            enabled: true
    """
    parts = section_path.split(".")
    if len(parts) != 2:
        return None
    parent_section, child_section = parts

    inside_parent = False
    inside_child = False
    parent_indent: int | None = None
    child_indent: int | None = None

    for raw in lines:
        line = _strip_comment(raw)
        if not line.strip():
            continue

        stripped = line.lstrip()
        indent = len(line) - len(stripped)

        # Top-level section match
        sect_match = _SECTION_RE.match(line)
        if sect_match:
            if indent == 0:
                inside_parent = sect_match.group(1) == parent_section
                inside_child = False
                parent_indent = None
                child_indent = None
            continue

        if not inside_parent:
            continue

        # Track parent indent from first child
        if parent_indent is None and indent > 0:
            parent_indent = indent

        # If indent goes back to 0, we left the parent section
        if indent == 0:
            inside_parent = False
            inside_child = False
            continue

        # Check if we're at the parent's child level (subsection)
        if parent_indent is not None and indent == parent_indent:
            # This could be a subsection or a direct key of parent
            sub_match = _INDENTED_SECTION_RE.match(line)
            if sub_match and sub_match.group(1) == child_section:
                inside_child = True
                child_indent = None
                continue
            else:
                inside_child = False
                continue

        if not inside_child:
            continue

        # We're inside the subsection: track child indent
        if child_indent is None and indent > (parent_indent or 0):
            child_indent = indent

        if child_indent is not None and indent != child_indent:
            continue

        key_match = _KEY_RE.match(line)
        if key_match and key_match.group(1) == key:
            return _unquote(key_match.group(2).strip())
    return None


def detect_from_config(path: Path) -> list[str]:
    try:
        text = path.read_text(encoding="utf-8", errors="replace")
    except OSError:
        return []
    lines = text.splitlines()
    extras: set[str] = set()
    if (section_value(lines, "database", "backend") or "").lower() == "postgres":
        extras.add("postgres")
    if (section_value(lines, "checkpointer", "type") or "").lower() == "postgres":
        extras.add("postgres")
    if (nested_section_value(lines, "channels.discord", "enabled") or "").lower() == "true":
        extras.add("discord")
    return sorted(extras)


def resolve_extras() -> list[str]:
    env = os.environ.get("UV_EXTRAS", "")
    if env.strip():
        return parse_env_extras(env)
    config = find_config_file()
    if config is None:
        return []
    return detect_from_config(config)


def format_flags(extras: list[str]) -> str:
    return " ".join(f"--extra {e}" for e in extras)


def main() -> int:
    extras = resolve_extras()
    if extras:
        sys.stdout.write(format_flags(extras))
    return 0


if __name__ == "__main__":
    sys.exit(main())