# Storage Package Design ## Background DeerFlow currently has several persistence responsibilities spread across app, gateway, runtime, and legacy persistence modules. This makes the persistence boundary difficult to reason about and creates several migration risks: - Routers and runtime services can accidentally depend on concrete persistence implementations instead of stable contracts. - User/auth, run metadata, thread metadata, feedback, run events, and checkpointer setup are initialized through different paths. - Some persistence behavior is duplicated between memory, SQLite, and PostgreSQL-oriented code paths. - Incremental migration is hard because app-level code and storage-level code are coupled. - Adding or validating another SQL backend requires touching app/runtime code instead of a storage-owned package. The storage package is introduced to make application data persistence a package-level capability with explicit contracts, a clear boundary, and SQL backend compatibility. ## Goals - Provide a standalone `packages/storage` package for durable application data. - Support SQLite, PostgreSQL, and MySQL through a shared persistence construction flow. - Keep LangGraph checkpointer initialization compatible with the same database backend. - Expose repository contracts as the only package-level data access boundary. - Let the app layer depend on app-owned adapters under `app.infra.storage`, not on storage DB implementation classes. - Allow the app/gateway migration to happen in small steps without forcing a large rewrite. ## Non-Goals - This design does not remove legacy persistence in the first PR. - This design does not move routers directly onto storage package models. - This design does not make app routers own SQLAlchemy sessions. - Cron persistence is intentionally out of scope for the storage package foundation. - Memory backend is not part of the durable storage package. Memory compatibility, if still needed by app runtime, belongs outside `packages/storage`. ## Storage Design Principles ### Package-Owned Durable Storage `packages/storage` owns durable application data persistence. It defines: - configuration shape for storage-backed persistence - SQLAlchemy models - repository contracts and DTOs - SQL repository implementations - persistence factory functions - compatibility helpers for config-driven initialization The package should be usable without importing `app.gateway`, routers, auth providers, or runtime-specific gateway objects. ### SQL Backend Compatibility The package supports three SQL backends: - SQLite for local/single-node deployments - PostgreSQL for production multi-node deployments - MySQL for deployments that standardize on MySQL Backend-specific differences are handled inside the storage package: - SQLAlchemy async engine URL construction - LangGraph checkpointer connection-string compatibility - JSON metadata filtering across SQLite/PostgreSQL/MySQL - SQL dialect behavior around locking, aggregation, and JSON type semantics ### Unified Persistence Bundle Storage initialization returns an `AppPersistence` bundle: ```python @dataclass(slots=True) class AppPersistence: checkpointer: Checkpointer engine: AsyncEngine session_factory: async_sessionmaker[AsyncSession] setup: Callable[[], Awaitable[None]] aclose: Callable[[], Awaitable[None]] ``` The app runtime can initialize persistence once, call `setup()`, and then inject: - `checkpointer` - `session_factory` - repository adapters This keeps checkpointer and application data aligned to the same backend without requiring routers to understand database configuration. ## Package Layout ```text backend/packages/storage/ store/ config/ storage_config.py app_config.py persistence/ factory.py types.py base_model.py json_compat.py drivers/ sqlite.py postgres.py mysql.py repositories/ contracts/ user.py run.py thread_meta.py feedback.py run_event.py models/ user.py run.py thread_meta.py feedback.py run_event.py db/ user.py run.py thread_meta.py feedback.py run_event.py factory.py ``` ## Persistence Construction The primary storage entrypoint is: ```python from store.persistence import create_persistence_from_storage_config persistence = await create_persistence_from_storage_config(storage_config) await persistence.setup() ``` For app-level compatibility with existing database config shape: ```python from store.persistence import create_persistence_from_database_config persistence = await create_persistence_from_database_config(config.database) await persistence.setup() ``` Expected app startup flow: ```python persistence = await create_persistence_from_database_config(config.database) await persistence.setup() app.state.persistence = persistence app.state.checkpointer = persistence.checkpointer app.state.session_factory = persistence.session_factory ``` Expected app shutdown flow: ```python await app.state.persistence.aclose() ``` ## Repository Contract Design Repository contracts are the storage package's public data access boundary. They live under `store.repositories.contracts` and are re-exported from `store.repositories`. The key contract groups are: - `UserRepositoryProtocol` - `RunRepositoryProtocol` - `ThreadMetaRepositoryProtocol` - `FeedbackRepositoryProtocol` - `RunEventRepositoryProtocol` Each contract owns: - input DTOs, such as `UserCreate`, `RunCreate`, `ThreadMetaCreate` - output DTOs, such as `User`, `Run`, `ThreadMeta` - repository protocol methods - domain-specific exceptions when needed, such as `InvalidMetadataFilterError` Repository construction is session-based: ```python from store.repositories import build_run_repository async with persistence.session_factory() as session: repo = build_run_repository(session) run = await repo.get_run(run_id) ``` This keeps transaction ownership explicit. The storage package does not hide commits or session lifecycle inside global singletons. ## App/Infra Calling Contract The app layer should not call `store.repositories.db.*` directly. The intended app boundary is `app.infra.storage`. `app.infra.storage` is responsible for: - receiving `session_factory` from FastAPI runtime initialization - owning session lifecycle for app-facing repository methods - translating storage DTOs to app/gateway DTOs only when needed - preserving the existing app-facing names during migration - depending on storage repository protocols, not concrete DB classes Expected adapter pattern: ```python class StorageRunRepository(RunRepositoryProtocol): def __init__(self, session_factory): self._session_factory = session_factory async def get_run(self, run_id: str): async with self._session_factory() as session: repo = build_run_repository(session) return await repo.get_run(run_id) ``` For gateway compatibility, app state can keep existing names while the implementation changes: ```python app.state.run_store = StorageRunStore(run_repository) app.state.feedback_repo = StorageFeedbackStore(feedback_repository) app.state.thread_store = StorageThreadMetaStore(thread_meta_repository) app.state.run_event_store = StorageRunEventStore(run_event_repository) app.state.checkpointer = persistence.checkpointer app.state.session_factory = persistence.session_factory ``` The app-facing objects may expose legacy method names during migration, but their internal data access should go through storage contracts. ## Boundary Rules ### Allowed Calls Storage package callers may use: ```python from store.persistence import create_persistence_from_database_config from store.persistence import create_persistence_from_storage_config from store.repositories import build_run_repository from store.repositories import build_user_repository from store.repositories import build_thread_meta_repository from store.repositories import build_feedback_repository from store.repositories import build_run_event_repository from store.repositories import RunRepositoryProtocol from store.repositories import UserRepositoryProtocol ``` App layer callers should use: ```python from app.infra.storage import StorageRunRepository from app.infra.storage import StorageUserDataRepository from app.infra.storage import StorageThreadMetaRepository from app.infra.storage import StorageFeedbackRepository from app.infra.storage import StorageRunEventRepository ``` ### Prohibited Calls App/gateway/router/auth code must not import: ```python from store.repositories.db import DbRunRepository from store.repositories.models import Run from store.persistence.base_model import MappedBase ``` Routers must not: - create SQLAlchemy engines - create SQLAlchemy sessions directly - call storage DB repository classes directly - commit/rollback storage transactions directly unless explicitly scoped by an infra adapter - depend on storage SQLAlchemy model classes Storage package code must not import: ```python import app.gateway import app.infra import deerflow.runtime ``` The dependency direction is: ```text app/gateway -> app.infra.storage -> packages/storage contracts/factories -> packages/storage db implementations ``` The reverse direction is forbidden. ## Checkpointer Compatibility The storage persistence bundle initializes the LangGraph checkpointer alongside application data persistence. Backend-specific notes: - SQLite uses `langgraph-checkpoint-sqlite`. - PostgreSQL uses `langgraph-checkpoint-postgres` and requires a string `postgresql://...` connection URL. - MySQL uses `langgraph-checkpoint-mysql` and requires a string MySQL connection URL. SQLAlchemy may use async driver URLs such as `postgresql+asyncpg://...` or `mysql+aiomysql://...`, but LangGraph checkpointer constructors expect plain string connection URLs. This conversion belongs inside the storage driver implementation. ## JSON Metadata Filtering Thread metadata search supports dialect-aware JSON filtering through `store.persistence.json_compat`. The matcher supports: - `None` - `bool` - `int` - `float` - `str` It rejects: - unsafe keys - nested JSON path expressions - dict/list values - integers outside signed 64-bit range This prevents SQL/JSON path injection, avoids compiled-cache type drift, and preserves type semantics such as `True != 1` and explicit JSON `null` not matching a missing key. ## Step-by-Step Implementation Plan ### Step 1: Introduce Storage Package Foundation - Add `backend/packages/storage`. - Add storage config models. - Add `AppPersistence`. - Add SQLite/PostgreSQL/MySQL persistence drivers. - Add repository contracts, models, DB implementations, and factory helpers. - Add package dependency wiring. - Exclude cron persistence. ### Step 2: Harden Storage Backend Compatibility - Validate SQLite setup and repository behavior. - Validate PostgreSQL and MySQL with local E2E tests. - Fix checkpointer connection-string compatibility. - Fix PostgreSQL locking and aggregation differences. - Add dialect-aware JSON metadata filtering. ### Step 3: Add App Infra Adapters - Add `backend/app/infra/storage`. - Implement app-facing repositories that own session lifecycle. - Keep storage contracts as the only data access boundary. - Add legacy compatibility adapters for existing app/gateway method shapes. - Keep app/gateway imports out of `packages/storage`. ### Step 4: Switch FastAPI Runtime Injection - Initialize storage persistence in FastAPI startup/lifespan. - Attach `persistence`, `checkpointer`, and `session_factory` to `app.state`. - Preserve existing external state names: - `run_store` - `feedback_repo` - `thread_store` - `run_event_store` - `checkpointer` - `session_factory` - Start with user/auth provider construction, then migrate run/thread/feedback/run_event. ### Step 5: Router and Auth Compatibility - Ensure routers consume app-facing adapters, not storage DB classes. - Ensure auth providers depend on user repository contracts. - Keep router response shapes unchanged. - Add focused auth/admin/router regression tests. ### Step 6: Cleanup Legacy Persistence - Compare old persistence usage after app/gateway migration. - Remove unused old repository implementations only after all call sites move. - Keep compatibility shims only where needed for a transition window. - Delete memory backend paths from storage-owned durable persistence. ## Testing Strategy Unit tests should cover: - config parsing - persistence setup - table creation - repository CRUD/query behavior - typed JSON metadata filtering - dialect SQL compilation - cron exclusion E2E tests should cover: - SQLite persistence setup - PostgreSQL temporary database setup - MySQL temporary database setup - repository contract behavior across all supported SQL backends - JSON/Unicode round trip - rollback behavior - persistence close/cleanup E2E tests may remain local-only if CI does not provide PostgreSQL/MySQL services.