mirror of
https://github.com/bytedance/deer-flow.git
synced 2026-05-20 15:11:09 +00:00
`make dev` ran `uv sync` unconditionally on every restart, wiping any
optional extras the user had installed manually with
`uv sync --all-packages --extra postgres`. The Docker image-build path
already solved this via the `UV_EXTRAS` build-arg in backend/Dockerfile;
the local serve.sh path and the docker-compose-dev startup command
were the remaining outliers.

`scripts/serve.sh` now resolves extras before `uv sync`:
  1. honors `UV_EXTRAS` (parity with backend/Dockerfile and
     docker/docker-compose.yaml, so no new convention is introduced);
  2. falls back to parsing config.yaml: `database.backend: postgres`
     or legacy `checkpointer.type: postgres` auto-pins
     `--extra postgres`, so the common case needs zero extra config.

Detector stderr is no longer suppressed, so whitelist warnings or
crashes surface to the dev terminal (review feedback).
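
The resolution order described above (environment variable first, config.yaml second) can be sketched in a few lines of Python. This is a minimal illustration, not the code in `scripts/detect_uv_extras.py`: the function name `resolve` and its two parameters are hypothetical, and the config lookup is reduced to a single already-parsed value.

```python
from __future__ import annotations

import re

# Same name pattern the commit message quotes for extra names.
_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_-]*$")


def resolve(env_value: str, config_backend: str | None) -> list[str]:
    """Hypothetical helper: UV_EXTRAS wins; otherwise infer extras from config."""
    if env_value.strip():
        tokens = [t for t in re.split(r"[\s,]+", env_value.strip()) if t]
        return [t for t in tokens if _NAME.match(t)]
    if (config_backend or "").lower() == "postgres":
        return ["postgres"]
    return []


print(resolve("postgres,ollama", None))  # env var is set, so it is used as-is
print(resolve("", "postgres"))           # env var empty, config decides
print(resolve("ollama", "postgres"))     # env var takes precedence over config
```

Note that a non-empty `UV_EXTRAS` replaces the config-derived result rather than being merged with it, which matches the "env-vs-config precedence" wording in the test list.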

Detection lives in `scripts/detect_uv_extras.py` (stdlib-only, because it
has to run before the venv exists). Extra names are validated against
`^[A-Za-z][A-Za-z0-9_-]*$` so a stray shell metacharacter in `.env`
cannot reach `uv sync` downstream (defense in depth).
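
To make the whitelist concrete, the sketch below runs a handful of candidate names through the quoted pattern. The helper name `partition` and the sample inputs are made up for illustration; only the regular expression comes from the commit.

```python
import re

EXTRA_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_-]*$")


def partition(names):
    """Hypothetical helper: split names into (accepted, rejected) lists."""
    accepted, rejected = [], []
    for name in names:
        (accepted if EXTRA_NAME.match(name) else rejected).append(name)
    return accepted, rejected


accepted, rejected = partition(["postgres", "my-extra_2", ";rm", "$(id)", "2fast"])
print(accepted)  # names that are safe to pass as `--extra <name>`
print(rejected)  # names containing metacharacters or starting with a digit
```

Because the first character must be a letter and the rest are limited to letters, digits, `_` and `-`, an accepted name cannot contain whitespace, quotes, `;`, `$` or glob characters.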

`docker/docker-compose-dev.yaml`'s startup command is now extracted to
`docker/dev-entrypoint.sh` (review feedback: the inline command had
grown to a ~350-char one-liner). The script:
  - parses comma/whitespace-separated UV_EXTRAS, applying the same
    `^[A-Za-z][A-Za-z0-9_-]*$` whitelist as the local detector;
  - emits one `--extra X` flag per token, so `UV_EXTRAS=postgres,ollama`
    works in Docker dev too (harmonized with local, per review feedback);
  - calls `uv sync --all-packages` (PR #2584) so workspace member
    extras (deerflow-harness's postgres extra) are installed;
  - keeps the existing self-heal `(uv sync || (recreate venv && retry))`
    branch;
  - exposes `--print-extras` for dry-run testing.
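
The entrypoint itself is a shell script and is not shown on this page, so the following is only a Python rendering of the behavior the bullets describe: one `--extra X` pair per token, and an abort (rather than a silent drop) on an invalid name, as the "metacharacter abort" test description suggests. The function name `build_sync_args` and the use of `ValueError` are assumptions made for the sketch.

```python
import re

_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_-]*$")


def build_sync_args(uv_extras: str) -> list[str]:
    """Hypothetical helper: turn a UV_EXTRAS value into a `uv sync` argv list."""
    args = ["uv", "sync", "--all-packages"]
    for token in re.split(r"[\s,]+", uv_extras.strip()):
        if not token:
            continue  # empty input or doubled separators
        if not _NAME.match(token):
            # Assumed behavior: refuse to continue instead of skipping the token.
            raise ValueError(f"invalid extra name: {token!r}")
        args += ["--extra", token]
    return args


print(build_sync_args("postgres,ollama"))
print(build_sync_args(""))
```

Building an argument list instead of a single string is what keeps each extra name a separate argv token.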

The compose file mounts the script read-only at runtime, so script
edits take effect on `make docker-restart` without an image rebuild.

The `--no-sync` alternative (a separate suggestion in the issue thread)
was considered but rejected for dev paths because it would drop the
self-heal branch and the auto-pickup of new pyproject deps. `--no-sync`
is already in use for the production CMD (`backend/Dockerfile:101`)
where it's appropriate.

Updates the asyncpg-missing error message to include the
`--all-packages` flag (matching #2584) plus the persistent install flow,
and expands `config.example.yaml` so all three install paths
(local / docker dev / docker image build) are documented with their
multi-extra capabilities.

Tests:
  - `tests/test_detect_uv_extras.py` (21 tests): local-path env parsing,
    YAML edge cases, env-vs-config precedence, whitelist rejection of
    shell metacharacters.
  - `tests/test_dev_entrypoint.py` (15 tests): docker-path validation
    via `--print-extras`, multi-extra parsing, metacharacter abort.
  - `tests/test_persistence_scaffold.py` (22 tests, unchanged): passes
    with the merged `--all-packages --extra postgres` error message.

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Executable
+180
@@ -0,0 +1,180 @@
#!/usr/bin/env python3
"""Resolve uv extras for local `uv sync` based on environment + config.yaml.

Order of resolution:
  1. `UV_EXTRAS` env var. Comma- or whitespace-separated names so multiple
     extras can be layered (e.g. ``UV_EXTRAS=postgres,ollama``). The same
     parsing semantics apply in the Docker dev container via
     ``docker/dev-entrypoint.sh``. The Docker image-build path
     (``backend/Dockerfile``) still treats `UV_EXTRAS` as a single token, so
     ``UV_EXTRAS=postgres,ollama`` would only install ``postgres,ollama`` as
     one (invalid) extra at build time; author build-time values as a
     single name.
  2. Auto-detection from config.yaml, which currently maps:
     - database.backend == postgres -> postgres
     - checkpointer.type == postgres -> postgres

Each extra name is validated against ``^[A-Za-z][A-Za-z0-9_-]*$`` (the same
shape uv enforces for `[project.optional-dependencies]` keys). Anything else
is dropped with a stderr warning so a stray shell metacharacter in `.env`
cannot reach the `uv sync` invocation downstream.

Output: space-separated `--extra <name>` flags ready for splat into
`uv sync`, e.g. `--extra postgres`. Empty output means "no extras".

Intentionally implemented with the standard library only: this script must run
*before* `uv sync` has populated the venv, so it cannot depend on PyYAML.
"""

from __future__ import annotations

import os
import re
import sys
from pathlib import Path

# Mirrors uv's accepted shape for extra names; keeps the eventual
# `uv sync --extra <name>` invocation free of shell metacharacters even when
# `UV_EXTRAS` comes from `.env` or another semi-trusted source.
_EXTRA_NAME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_-]*$")


def _validate_extras(names: list[str]) -> list[str]:
    valid: list[str] = []
    for name in names:
        if _EXTRA_NAME_RE.match(name):
            valid.append(name)
        else:
            print(
                f"detect_uv_extras: ignoring invalid UV_EXTRAS entry {name!r} (must match [A-Za-z][A-Za-z0-9_-]*)",
                file=sys.stderr,
            )
    return valid


def parse_env_extras(value: str) -> list[str]:
    """Split UV_EXTRAS into a list, accepting comma or whitespace separators."""
    parts = re.split(r"[\s,]+", value.strip())
    return _validate_extras([p for p in parts if p])


def find_config_file() -> Path | None:
    """Locate config.yaml using the same precedence as serve.sh."""
    explicit = os.environ.get("DEER_FLOW_CONFIG_PATH")
    if explicit:
        candidate = Path(explicit)
        if candidate.is_file():
            return candidate
    for path in (Path("config.yaml"), Path("backend/config.yaml")):
        if path.is_file():
            return path
    return None


_SECTION_RE = re.compile(r"^([A-Za-z_][\w-]*)\s*:\s*$")
_KEY_RE = re.compile(r"^\s+([A-Za-z_][\w-]*)\s*:\s*(\S.*?)\s*$")


def _strip_comment(line: str) -> str:
    """Drop trailing `#` comments while preserving `#` inside quoted strings."""
    in_quote: str | None = None
    out: list[str] = []
    for ch in line:
        if in_quote is not None:
            out.append(ch)
            if ch == in_quote:
                in_quote = None
            continue
        if ch in ("'", '"'):
            in_quote = ch
            out.append(ch)
        elif ch == "#":
            break
        else:
            out.append(ch)
    return "".join(out).rstrip()


def _unquote(value: str) -> str:
    if len(value) >= 2 and value[0] == value[-1] and value[0] in ("'", '"'):
        return value[1:-1]
    return value


def section_value(lines: list[str], section: str, key: str) -> str | None:
    """Return the value of `section.key` from a flat-ish YAML, or None.

    Only handles the shallow shape DeerFlow uses for these settings:
        database:
          backend: postgres
    Nested mappings deeper than the immediate child level are ignored on
    purpose, which keeps this parser predictable without a full YAML stack.
    """
    inside = False
    child_indent: int | None = None
    for raw in lines:
        line = _strip_comment(raw)
        if not line.strip():
            continue
        sect_match = _SECTION_RE.match(line)
        if sect_match:
            inside = sect_match.group(1) == section
            child_indent = None
            continue
        if not inside:
            continue
        stripped = line.lstrip()
        indent = len(line) - len(stripped)
        if indent == 0:
            inside = False
            continue
        if child_indent is None:
            child_indent = indent
        if indent < child_indent:
            inside = False
            continue
        if indent != child_indent:
            continue
        key_match = _KEY_RE.match(line)
        if key_match and key_match.group(1) == key:
            return _unquote(key_match.group(2).strip())
    return None


def detect_from_config(path: Path) -> list[str]:
    try:
        text = path.read_text(encoding="utf-8", errors="replace")
    except OSError:
        return []
    lines = text.splitlines()
    extras: set[str] = set()
    if (section_value(lines, "database", "backend") or "").lower() == "postgres":
        extras.add("postgres")
    if (section_value(lines, "checkpointer", "type") or "").lower() == "postgres":
        extras.add("postgres")
    return sorted(extras)


def resolve_extras() -> list[str]:
    env = os.environ.get("UV_EXTRAS", "")
    if env.strip():
        return parse_env_extras(env)
    config = find_config_file()
    if config is None:
        return []
    return detect_from_config(config)


def format_flags(extras: list[str]) -> str:
    return " ".join(f"--extra {e}" for e in extras)


def main() -> int:
    extras = resolve_extras()
    if extras:
        sys.stdout.write(format_flags(extras))
    return 0


if __name__ == "__main__":
    sys.exit(main())
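
As a worked example of the shallow `section.key` lookup, the sketch below runs a condensed variant of the idea against a small sample config. It is deliberately simplified and is not the function from the file: `shallow_lookup` and `SAMPLE` are hypothetical, comment stripping is naive (it assumes no ` #` sequence inside a value), and dedent handling is reduced to the top-level case.

```python
from __future__ import annotations

import re

SAMPLE = """\
# example config
database:
  backend: "postgres"   # trailing comment
  options:
    pool_size: 5        # nested one level deeper, so ignored
checkpointer:
  type: memory
"""


def shallow_lookup(text: str, section: str, key: str) -> str | None:
    """Hypothetical, simplified lookup of an immediate child key of a section."""
    inside = False
    child_indent: int | None = None
    for raw in text.splitlines():
        line = raw.split(" #", 1)[0].rstrip()  # naive comment strip
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        indent = len(line) - len(line.lstrip())
        if indent == 0:
            inside = line == f"{section}:"
            child_indent = None
            continue
        if not inside:
            continue
        if child_indent is None:
            child_indent = indent  # first child fixes the expected indent
        if indent != child_indent:
            continue  # deeper nesting is skipped on purpose
        m = re.match(r"^\s+([\w-]+)\s*:\s*(\S.*)$", line)
        if m and m.group(1) == key:
            return m.group(2).strip().strip("'\"")
    return None


print(shallow_lookup(SAMPLE, "database", "backend"))
print(shallow_lookup(SAMPLE, "database", "pool_size"))
print(shallow_lookup(SAMPLE, "checkpointer", "type"))
```

The second lookup returns `None` because `pool_size` sits below `options`, not directly below `database`, which is the same restriction the docstring of `section_value` describes.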
+33
-1
@@ -157,9 +157,41 @@ fi
 
 # ── Install dependencies ────────────────────────────────────────────────────
 
+# Pick a Python for the extras detector. Falls back to plain `python` for
+# Windows/Git Bash where only `python` is on PATH.
+if command -v python3 >/dev/null 2>&1; then
+    DETECT_PYTHON="python3"
+elif command -v python >/dev/null 2>&1; then
+    DETECT_PYTHON="python"
+else
+    DETECT_PYTHON=""
+fi
+
+# Resolve uv extras (postgres, etc.) from UV_EXTRAS or config.yaml so that
+# `uv sync` does not wipe out optional dependencies on every restart. See
+# scripts/detect_uv_extras.py and Issue #2754 for context. The detector
+# whitelists extra names against `^[A-Za-z][A-Za-z0-9_-]*$`, so the unquoted
+# splat below only sees valid uv argument tokens.
+#
+# Stderr is intentionally NOT redirected so the user sees:
+#   - whitelist warnings (e.g. "ignoring invalid UV_EXTRAS entry ';'");
+#   - detector crashes (e.g. unexpected Python error).
+# The `|| { ...; }` fallback keeps `set -e` from killing dev startup on a
+# detector failure; the result is just an empty UV_EXTRAS_FLAGS ("no extras").
+UV_EXTRAS_FLAGS=""
+if [ -n "$DETECT_PYTHON" ]; then
+    UV_EXTRAS_FLAGS=$("$DETECT_PYTHON" "$REPO_ROOT/scripts/detect_uv_extras.py" || { echo "[serve.sh] detect_uv_extras.py failed (exit $?); proceeding without extras" >&2; echo ""; })
+fi
+
 if ! $SKIP_INSTALL; then
     echo "Syncing dependencies..."
-    (cd backend && uv sync --quiet) || { echo "✗ Backend dependency install failed"; exit 1; }
+    if [ -n "$UV_EXTRAS_FLAGS" ]; then
+        echo " • uv extras: $UV_EXTRAS_FLAGS"
+    fi
+    # `--all-packages` propagates extras into workspace members (deerflow-harness
+    # in particular). Required for postgres extras; see PR #2584.
+    # Intentionally unquoted to splat multiple `--extra X` pairs.
+    (cd backend && uv sync --quiet --all-packages $UV_EXTRAS_FLAGS) || { echo "✗ Backend dependency install failed"; exit 1; }
     (cd frontend && pnpm install --silent) || { echo "✗ Frontend dependency install failed"; exit 1; }
     echo "✓ Dependencies synced"
 else
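
The two shell mechanisms in this hunk, "fall back to no extras if the detector fails" and "splat the flag string into separate arguments", can be modeled in Python as follows. This is an illustrative sketch only: `detect_flags` is a hypothetical name, the detector is simulated with inline `python -c` commands, and unlike the shell command substitution this version discards any partial stdout when the detector exits non-zero.

```python
import subprocess
import sys


def detect_flags(cmd: list[str]) -> str:
    """Hypothetical helper: run a detector; return its stdout or '' on failure."""
    # stderr is not captured, so detector warnings still reach the terminal.
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, text=True)
    if proc.returncode != 0:
        print(
            f"detector failed (exit {proc.returncode}); proceeding without extras",
            file=sys.stderr,
        )
        return ""
    return proc.stdout.strip()


# Simulated detector output (assumed values for the example).
flags = detect_flags([sys.executable, "-c", "print('--extra postgres --extra ollama')"])

# Whitespace splitting models the unquoted shell expansion: because every
# name passed the whitelist, each piece is a plain argv token.
argv = ["uv", "sync", "--quiet", "--all-packages", *flags.split()]
print(argv)

# A crashing detector degrades to "no extras" instead of aborting startup.
print(repr(detect_flags([sys.executable, "-c", "import sys; sys.exit(3)"])))
```

With an empty flag string, `flags.split()` yields no tokens, so the command reduces to a plain `uv sync --quiet --all-packages`.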