fix(tool-search): reliably hide deferred MCP schemas by removing the ContextVar (closures + graph state) (#3342)

* feat(tool-search): add hash-scoped promoted state to ThreadState * feat(tool-search): add immutable DeferredToolCatalog with stable hash * feat(tool-search): add build_deferred_tool_setup + Command-writing tool_search * refactor(tool-search): replace deferred-tool ContextVar with closures + graph state (#3272) Build the deferred catalog + tool_search tool per agent from the policy-filtered tool list (after skill allowed-tools), pass deferred_names + catalog_hash explicitly to DeferredToolFilterMiddleware and the prompt, and record promotions in ThreadState.promoted (scoped by catalog_hash) via a Command-returning tool_search. Removes DeferredToolRegistry and the _registry_var ContextVar so deferral no longer depends on build/execute sharing an async context. MCP tools are tagged with metadata[deerflow_mcp]; client.py assembles deferral the same way. Catalog is built AFTER tool-policy filtering (no policy-excluded tool can leak via tool_search) and assembly is fail-closed. Migrate tests off the deleted registry APIs; delete the obsolete ContextVar-based #2884 regression (re-covered by state-based tests in a follow-up). * test(tool-search): lock tool_search promotion into next model turn via graph state * test(tool-search): cross-context, policy-leak, fail-closed, #2884 isolation regressions * test(tool-search): align real-LLM e2e with closure-based deferred setup * docs: update DeferredToolFilterMiddleware description for closure+state design * style(tests): drop unused import in test_deferred_setup (ruff) * test(tool-search): harden merge_promoted + replace tautological catalog test From independent code review: - merge_promoted: use existing.get("catalog_hash") so a forward-incompatible or externally-injected persisted promoted dict triggers a replace instead of a KeyError crash; add regression test for the malformed-existing case. - test_deferred_catalog: replace the `== [] or True` tautology (a test that could never fail) with a deterministic invalid-regex->literal-fallback check (positive match on calc + negative empty match). - DeferredToolCatalog: comment why frozen-without-slots is required for the cached_property hash/names fields (adding slots=True would break them). * fix(tool-search): read tool_search.enabled from self._app_config in client DeerFlowClient._ensure_agent called get_app_config() directly to read tool_search.enabled, but the client already resolves and stores its config as self._app_config at construction (and uses it everywhere else). The bare call re-resolves config from disk at agent-build time, which raises FileNotFoundError in environments without a config.yaml (CI) — test_client.py's fixture only patches get_app_config during __init__, so the later call hit the real loader. Use self._app_config, matching the rest of the client. * test(tool-search): lock tool_search post-policy append ordering tool_search is appended after skill-allowlist filtering, so the allowlist can no longer deny it by name. Lock the intended contract: it only appears when allowed MCP tools survive the filter, and its catalog (derived from the already policy-filtered list) can never expose a denied tool. Addresses the ordering observation from the Copilot review on #3342.
2026-06-10 09:25:57 +00:00 · 2026-06-02 22:43:22 +08:00
parent 74e3e80cf6
commit d9f4724950
17 changed files with 768 additions and 1267 deletions
@@ -270,6 +270,7 @@ def _build_middlewares(
    custom_middlewares: list[AgentMiddleware] | None = None,
    *,
    app_config: AppConfig | None = None,
+    deferred_setup=None,
 ):
    """Build middleware chain based on runtime configuration.

@@ -318,11 +319,13 @@ def _build_middlewares(
    if model_config is not None and model_config.supports_vision:
        middlewares.append(ViewImageMiddleware())

-    # Add DeferredToolFilterMiddleware to hide deferred tool schemas from model binding
-    if resolved_app_config.tool_search.enabled:
+    # Hide deferred tool schemas from model binding until tool_search promotes them.
+    # The deferred set + catalog hash come from the build-time setup (assembled
+    # after tool-policy filtering); promotion is read from graph state.
+    if deferred_setup is not None and deferred_setup.deferred_names:
        from deerflow.agents.middlewares.deferred_tool_filter_middleware import DeferredToolFilterMiddleware

-        middlewares.append(DeferredToolFilterMiddleware())
+        middlewares.append(DeferredToolFilterMiddleware(deferred_setup.deferred_names, deferred_setup.catalog_hash))

    # Add SubagentLimitMiddleware to truncate excess parallel task calls
    subagent_enabled = cfg.get("subagent_enabled", False)
@@ -353,6 +356,23 @@ def _build_middlewares(
    return middlewares


+def _assemble_deferred(filtered_tools, *, enabled: bool):
+    """Build the final tool list + deferred setup from a policy-filtered list.
+
+    Call AFTER tool-policy filtering so the deferred catalog never exposes a
+    tool the agent is not allowed to use. Fail-closed: if tool_search is enabled
+    and MCP tools survived filtering but no deferred set was recovered, raise
+    rather than silently binding their full schemas to the model.
+    """
+    from deerflow.tools.builtins.tool_search import _is_mcp_tool, build_deferred_tool_setup
+
+    setup = build_deferred_tool_setup(filtered_tools, enabled=enabled)
+    if enabled and not setup.deferred_names and any(_is_mcp_tool(t) for t in filtered_tools):
+        raise RuntimeError("tool_search enabled and MCP tools survived policy filtering, but no deferred set was recovered — refusing to bind MCP schemas (fail-closed).")
+    final_tools = list(filtered_tools) + ([setup.tool_search_tool] if setup.tool_search_tool else [])
+    return final_tools, setup
+
+
 def _available_skill_names(agent_config, is_bootstrap: bool) -> set[str] | None:
    if is_bootstrap:
        return {"bootstrap"}
@@ -460,16 +480,19 @@ def _make_lead_agent(config: RunnableConfig, *, app_config: AppConfig):

    if is_bootstrap:
        # Special bootstrap agent with minimal prompt for initial custom agent creation flow
-        tools = get_available_tools(model_name=model_name, subagent_enabled=subagent_enabled, app_config=resolved_app_config) + [setup_agent]
+        raw_tools = get_available_tools(model_name=model_name, subagent_enabled=subagent_enabled, app_config=resolved_app_config) + [setup_agent]
+        filtered = filter_tools_by_skill_allowed_tools(raw_tools, skills_for_tool_policy)
+        final_tools, setup = _assemble_deferred(filtered, enabled=resolved_app_config.tool_search.enabled)
        return create_agent(
            model=create_chat_model(name=model_name, thinking_enabled=thinking_enabled, app_config=resolved_app_config, attach_tracing=False),
-            tools=filter_tools_by_skill_allowed_tools(tools, skills_for_tool_policy),
-            middleware=_build_middlewares(config, model_name=model_name, app_config=resolved_app_config),
+            tools=final_tools,
+            middleware=_build_middlewares(config, model_name=model_name, app_config=resolved_app_config, deferred_setup=setup),
            system_prompt=apply_prompt_template(
                subagent_enabled=subagent_enabled,
                max_concurrent_subagents=max_concurrent_subagents,
                available_skills=set(["bootstrap"]),
                app_config=resolved_app_config,
+                deferred_names=setup.deferred_names,
            ),
            state_schema=ThreadState,
        )
@@ -478,17 +501,20 @@ def _make_lead_agent(config: RunnableConfig, *, app_config: AppConfig):
    # The default agent (no agent_name) does not see this tool.
    extra_tools = [update_agent] if agent_name else []
    # Default lead agent (unchanged behavior)
-    tools = get_available_tools(model_name=model_name, groups=agent_config.tool_groups if agent_config else None, subagent_enabled=subagent_enabled, app_config=resolved_app_config)
+    raw_tools = get_available_tools(model_name=model_name, groups=agent_config.tool_groups if agent_config else None, subagent_enabled=subagent_enabled, app_config=resolved_app_config)
+    filtered = filter_tools_by_skill_allowed_tools(raw_tools + extra_tools, skills_for_tool_policy)
+    final_tools, setup = _assemble_deferred(filtered, enabled=resolved_app_config.tool_search.enabled)
    return create_agent(
        model=create_chat_model(name=model_name, thinking_enabled=thinking_enabled, reasoning_effort=reasoning_effort, app_config=resolved_app_config, attach_tracing=False),
-        tools=filter_tools_by_skill_allowed_tools(tools + extra_tools, skills_for_tool_policy),
-        middleware=_build_middlewares(config, model_name=model_name, agent_name=agent_name, app_config=resolved_app_config),
+        tools=final_tools,
+        middleware=_build_middlewares(config, model_name=model_name, agent_name=agent_name, app_config=resolved_app_config, deferred_setup=setup),
        system_prompt=apply_prompt_template(
            subagent_enabled=subagent_enabled,
            max_concurrent_subagents=max_concurrent_subagents,
            agent_name=agent_name,
            available_skills=set(agent_config.skills) if agent_config and agent_config.skills is not None else None,
            app_config=resolved_app_config,
+            deferred_names=setup.deferred_names,
        ),
        state_schema=ThreadState,
    )
@@ -684,33 +684,16 @@ Rules:
 """


-def get_deferred_tools_prompt_section(*, app_config: AppConfig | None = None) -> str:
-    """Generate <available-deferred-tools> block for the system prompt.
+def get_deferred_tools_prompt_section(*, deferred_names: frozenset[str] = frozenset()) -> str:
+    """Generate <available-deferred-tools> from an explicit deferred-name set.

-    Lists only deferred tool names so the agent knows what exists
-    and can use tool_search to load them.
-    Returns empty string when tool_search is disabled or no tools are deferred.
+    Lists only names so the agent knows what exists and can use tool_search to
+    load them. Returns empty string when there are no deferred tools. The set is
+    computed at agent build time (after tool-policy filtering) and passed in.
    """
-    from deerflow.tools.builtins.tool_search import get_deferred_registry
-
-    if app_config is None:
-        try:
-            from deerflow.config import get_app_config
-
-            config = get_app_config()
-        except Exception:
-            return ""
-    else:
-        config = app_config
-
-    if not config.tool_search.enabled:
+    if not deferred_names:
        return ""
-
-    registry = get_deferred_registry()
-    if not registry:
-        return ""
-
-    names = "\n".join(e.name for e in registry.entries)
+    names = "\n".join(sorted(deferred_names))
    return f"<available-deferred-tools>\n{names}\n</available-deferred-tools>"


@@ -772,6 +755,7 @@ def apply_prompt_template(
    agent_name: str | None = None,
    available_skills: set[str] | None = None,
    app_config: AppConfig | None = None,
+    deferred_names: frozenset[str] = frozenset(),
 ) -> str:
    # Include subagent section only if enabled (from runtime parameter)
    n = max_concurrent_subagents
@@ -799,7 +783,7 @@ def apply_prompt_template(
    skills_section = get_skills_prompt_section(available_skills, app_config=app_config)

    # Get deferred tools section (tool_search)
-    deferred_tools_section = get_deferred_tools_prompt_section(app_config=app_config)
+    deferred_tools_section = get_deferred_tools_prompt_section(deferred_names=deferred_names)

    # Build ACP agent section only if ACP agents are configured
    acp_section = _build_acp_section(app_config=app_config)
@@ -1,12 +1,15 @@
 """Middleware to filter deferred tool schemas from model binding.

-When tool_search is enabled, MCP tools are registered in the DeferredToolRegistry
-and passed to ToolNode for execution, but their schemas should NOT be sent to the
-LLM via bind_tools (that's the whole point of deferral — saving context tokens).
+When tool_search is enabled, MCP tools are still passed to ToolNode for
+execution, but their schemas must NOT be sent to the LLM via bind_tools until
+the model has discovered them via tool_search. This middleware removes the
+still-deferred tools from request.tools before model binding, and blocks tool
+calls to tools that have not been promoted yet.

-This middleware intercepts wrap_model_call and removes deferred tools from
-request.tools so that model.bind_tools only receives active tool schemas.
-The agent discovers deferred tools at runtime via the tool_search tool.
+The deferred name set and the catalog hash are injected at construction time
+(no ContextVar). Promotion state is read from graph state (``state["promoted"]``),
+scoped by catalog hash so a stale persisted promotion cannot expose a renamed
+or drifted tool.
 """

 import logging
@@ -24,47 +27,49 @@ logger = logging.getLogger(__name__)


 class DeferredToolFilterMiddleware(AgentMiddleware[AgentState]):
-    """Remove deferred tools from request.tools before model binding.
+    """Hide deferred tool schemas from the bound model until promoted.

    ToolNode still holds all tools (including deferred) for execution routing,
-    but the LLM only sees active tool schemas — deferred tools are discoverable
-    via tool_search at runtime.
+    but the LLM only sees active tool schemas plus tools that have already been
+    promoted (recorded in ``state["promoted"]`` under the current catalog hash).
    """

+    def __init__(self, deferred_names: frozenset[str], catalog_hash: str | None):
+        super().__init__()
+        self._deferred = deferred_names
+        self._catalog_hash = catalog_hash
+
+    def _promoted(self, state) -> set[str]:
+        promoted = (state or {}).get("promoted")
+        if promoted and promoted.get("catalog_hash") == self._catalog_hash:
+            return set(promoted.get("names") or [])
+        return set()
+
+    def _hidden(self, state) -> set[str]:
+        return set(self._deferred) - self._promoted(state)
+
    def _filter_tools(self, request: ModelRequest) -> ModelRequest:
-        from deerflow.tools.builtins.tool_search import get_deferred_registry
-
-        registry = get_deferred_registry()
-        if not registry:
+        if not self._deferred:
            return request
-
-        deferred_names = registry.deferred_names
-        active_tools = [t for t in request.tools if getattr(t, "name", None) not in deferred_names]
-
-        if len(active_tools) < len(request.tools):
-            logger.debug(f"Filtered {len(request.tools) - len(active_tools)} deferred tool schema(s) from model binding")
-
-        return request.override(tools=active_tools)
+        hide = self._hidden(request.state)
+        if not hide:
+            return request
+        active = [t for t in request.tools if getattr(t, "name", None) not in hide]
+        if len(active) < len(request.tools):
+            logger.debug("Filtered %d deferred tool schema(s) from model binding", len(request.tools) - len(active))
+        return request.override(tools=active)

    def _blocked_tool_message(self, request: ToolCallRequest) -> ToolMessage | None:
-        from deerflow.tools.builtins.tool_search import get_deferred_registry
-
-        registry = get_deferred_registry()
-        if not registry:
+        if not self._deferred:
            return None
-
-        tool_name = str(request.tool_call.get("name") or "")
-        if not tool_name:
+        name = str(request.tool_call.get("name") or "")
+        if not name or name not in self._hidden(request.state):
            return None
-
-        if not registry.contains(tool_name):
-            return None
-
        tool_call_id = str(request.tool_call.get("id") or "missing_tool_call_id")
        return ToolMessage(
-            content=(f"Error: Tool '{tool_name}' is deferred and has not been promoted yet. Call tool_search first to expose and promote this tool's schema, then retry."),
+            content=(f"Error: Tool '{name}' is deferred and has not been promoted yet. Call tool_search first to expose and promote this tool's schema, then retry."),
            tool_call_id=tool_call_id,
-            name=tool_name,
+            name=name,
            status="error",
        )

@@ -58,6 +58,32 @@ def merge_todos(existing: list | None, new: list | None) -> list | None:
    return new


+class PromotedTools(TypedDict):
+    catalog_hash: str
+    names: list[str]
+
+
+def merge_promoted(existing: PromotedTools | None, new: PromotedTools | None) -> PromotedTools | None:
+    """Reducer for deferred-tool promotions, scoped by catalog hash.
+
+    - new None/empty -> preserve existing (node didn't touch promotions).
+    - catalog_hash changed -> replace wholesale, dropping stale names (prevents a
+      persisted bare name from exposing a different tool after catalog drift).
+    - same catalog_hash -> union names, dedupe, preserve order.
+    """
+    if not new:
+        return existing
+    if existing is None or existing.get("catalog_hash") != new["catalog_hash"]:
+        return {
+            "catalog_hash": new["catalog_hash"],
+            "names": list(dict.fromkeys(new["names"])),
+        }
+    return {
+        "catalog_hash": existing["catalog_hash"],
+        "names": list(dict.fromkeys(existing["names"] + new["names"])),
+    }
+
+
 class ThreadState(AgentState):
    sandbox: NotRequired[SandboxState | None]
    thread_data: NotRequired[ThreadDataState | None]
@@ -66,3 +92,4 @@ class ThreadState(AgentState):
    todos: Annotated[list | None, merge_todos]
    uploaded_files: NotRequired[list[dict] | None]
    viewed_images: Annotated[dict[str, ViewedImageData], merge_viewed_images]  # image_path -> {base64, mime_type}
+    promoted: Annotated[PromotedTools | None, merge_promoted]
@@ -33,7 +33,7 @@ from langchain.agents.middleware import AgentMiddleware
 from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, ToolMessage
 from langchain_core.runnables import RunnableConfig

-from deerflow.agents.lead_agent.agent import _build_middlewares
+from deerflow.agents.lead_agent.agent import _assemble_deferred, _build_middlewares
 from deerflow.agents.lead_agent.prompt import apply_prompt_template
 from deerflow.agents.thread_state import ThreadState
 from deerflow.config.agents_config import AGENT_NAME_PATTERN
@@ -237,19 +237,22 @@ class DeerFlowClient:
        subagent_enabled = cfg.get("subagent_enabled", False)
        max_concurrent_subagents = cfg.get("max_concurrent_subagents", 3)

+        tools = self._get_tools(model_name=model_name, subagent_enabled=subagent_enabled)
+        final_tools, deferred_setup = _assemble_deferred(tools, enabled=self._app_config.tool_search.enabled)
        kwargs: dict[str, Any] = {
            # attach_tracing=False because ``stream()`` injects tracing
            # callbacks at the graph invocation root so a single embedded run
            # produces one trace with correct session_id / user_id propagation.
            # Attaching them again on the model would emit duplicate spans.
            "model": create_chat_model(name=model_name, thinking_enabled=thinking_enabled, attach_tracing=False),
-            "tools": self._get_tools(model_name=model_name, subagent_enabled=subagent_enabled),
-            "middleware": _build_middlewares(config, model_name=model_name, agent_name=self._agent_name, custom_middlewares=self._middlewares),
+            "tools": final_tools,
+            "middleware": _build_middlewares(config, model_name=model_name, agent_name=self._agent_name, custom_middlewares=self._middlewares, deferred_setup=deferred_setup),
            "system_prompt": apply_prompt_template(
                subagent_enabled=subagent_enabled,
                max_concurrent_subagents=max_concurrent_subagents,
                agent_name=self._agent_name,
                available_skills=self._available_skills,
+                deferred_names=deferred_setup.deferred_names,
            ),
            "state_schema": ThreadState,
        }
@@ -1,202 +1,151 @@
 """Tool search — deferred tool discovery at runtime.

 Contains:
- DeferredToolRegistry: stores deferred tools and handles regex search
- tool_search: the LangChain tool the agent calls to discover deferred tools
+- DeferredToolCatalog: immutable, searchable catalog of deferred tools.
+- build_tool_search_tool: builds the `tool_search` tool as a closure over a
+  catalog; it records promotions into graph state via ``Command``.
+- build_deferred_tool_setup: assembles the catalog + tool from a
+  policy-filtered tool list (call AFTER tool-policy filtering).

 The agent sees deferred tool names in <available-deferred-tools> but cannot
-call them until it fetches their full schema via the tool_search tool.
-Source-agnostic: no mention of MCP or tool origin.
+call them until it fetches their full schema via the tool_search tool. The
+deferred set rides on a build-time closure and promotion lives in per-thread
+graph state — there is no ContextVar. Source-agnostic: a tool is "deferred"
+when it carries the ``deerflow_mcp`` metadata tag.
 """

-import contextvars
+import hashlib
 import json
 import logging
 import re
 from dataclasses import dataclass
+from functools import cached_property
+from typing import Annotated

 from langchain.tools import BaseTool
-from langchain_core.tools import tool
+from langchain_core.messages import ToolMessage
+from langchain_core.tools import InjectedToolCallId, tool
 from langchain_core.utils.function_calling import convert_to_openai_function
+from langgraph.types import Command

 logger = logging.getLogger(__name__)

 MAX_RESULTS = 5  # Max tools returned per search


-# ── Registry ──
+# ── Catalog ──


-@dataclass
-class DeferredToolEntry:
-    """Lightweight metadata for a deferred tool (no full schema in context)."""
+# NOTE: frozen=True without slots=True keeps __dict__, which is what lets the
+# @cached_property fields below cache (they write to instance.__dict__, bypassing
+# the frozen __setattr__). Do NOT add slots=True or hash/names break at runtime.
+@dataclass(frozen=True)
+class DeferredToolCatalog:
+    """Immutable catalog of deferred tools. Pure search, no mutation."""

-    name: str
-    description: str
-    tool: BaseTool  # Full tool object, returned only on search match
+    tools: tuple[BaseTool, ...]

+    @cached_property
+    def names(self) -> frozenset[str]:
+        return frozenset(t.name for t in self.tools)

-class DeferredToolRegistry:
-    """Registry of deferred tools, searchable by regex pattern."""
-
-    def __init__(self):
-        self._entries: list[DeferredToolEntry] = []
-
-    def register(self, tool: BaseTool) -> None:
-        self._entries.append(
-            DeferredToolEntry(
-                name=tool.name,
-                description=tool.description or "",
-                tool=tool,
-            )
-        )
-
-    def promote(self, names: set[str]) -> None:
-        """Remove tools from the deferred registry so they pass through the filter.
-
-        Called after tool_search returns a tool's schema — the LLM now knows
-        the full definition, so the DeferredToolFilterMiddleware should stop
-        stripping it from bind_tools on subsequent calls.
-        """
-        if not names:
-            return
-        before = len(self._entries)
-        self._entries = [e for e in self._entries if e.name not in names]
-        promoted = before - len(self._entries)
-        if promoted:
-            logger.debug(f"Promoted {promoted} tool(s) from deferred to active: {names}")
+    @cached_property
+    def hash(self) -> str:
+        canon = [{"name": t.name, "schema": convert_to_openai_function(t)} for t in sorted(self.tools, key=lambda t: t.name)]
+        blob = json.dumps(canon, sort_keys=True, ensure_ascii=False, default=str)
+        return hashlib.sha256(blob.encode("utf-8")).hexdigest()[:16]

    def search(self, query: str) -> list[BaseTool]:
-        """Search deferred tools by regex pattern against name + description.
-
-        Supports three query forms (aligned with Claude Code):
-          - "select:name1,name2" — exact name match
-          - "+keyword rest" — name must contain keyword, rank by rest
-          - "keyword query" — regex match against name + description
-
-        Returns:
-            List of matched BaseTool objects (up to MAX_RESULTS).
-        """
        if query.startswith("select:"):
-            names = {n.strip() for n in query[7:].split(",")}
-            return [e.tool for e in self._entries if e.name in names][:MAX_RESULTS]
+            wanted = {n.strip() for n in query[7:].split(",")}
+            return [t for t in self.tools if t.name in wanted][:MAX_RESULTS]

        if query.startswith("+"):
            parts = query[1:].split(None, 1)
            required = parts[0].lower()
-            candidates = [e for e in self._entries if required in e.name.lower()]
+            candidates = [t for t in self.tools if required in t.name.lower()]
            if len(parts) > 1:
-                candidates.sort(
-                    key=lambda e: _regex_score(parts[1], e),
-                    reverse=True,
-                )
-            return [e.tool for e in candidates][:MAX_RESULTS]
+                candidates.sort(key=lambda t: _catalog_regex_score(parts[1], t), reverse=True)
+            return candidates[:MAX_RESULTS]

-        # General regex search
        try:
            regex = re.compile(query, re.IGNORECASE)
        except re.error:
            regex = re.compile(re.escape(query), re.IGNORECASE)
-
-        scored = []
-        for entry in self._entries:
-            searchable = f"{entry.name} {entry.description}"
+        scored: list[tuple[int, BaseTool]] = []
+        for t in self.tools:
+            searchable = f"{t.name} {t.description or ''}"
            if regex.search(searchable):
-                score = 2 if regex.search(entry.name) else 1
-                scored.append((score, entry))
-
+                scored.append((2 if regex.search(t.name) else 1, t))
        scored.sort(key=lambda x: x[0], reverse=True)
-        return [entry.tool for _, entry in scored][:MAX_RESULTS]
-
-    @property
-    def entries(self) -> list[DeferredToolEntry]:
-        return list(self._entries)
-
-    @property
-    def deferred_names(self) -> set[str]:
-        """Names of tools that are still hidden from model binding."""
-        return {entry.name for entry in self._entries}
-
-    def contains(self, name: str) -> bool:
-        """Return whether *name* is still deferred."""
-        return any(entry.name == name for entry in self._entries)
-
-    def __len__(self) -> int:
-        return len(self._entries)
+        return [t for _, t in scored][:MAX_RESULTS]


-def _regex_score(pattern: str, entry: DeferredToolEntry) -> int:
+def _catalog_regex_score(pattern: str, t: BaseTool) -> int:
    try:
        regex = re.compile(pattern, re.IGNORECASE)
    except re.error:
        regex = re.compile(re.escape(pattern), re.IGNORECASE)
-    return len(regex.findall(f"{entry.name} {entry.description}"))
+    return len(regex.findall(f"{t.name} {t.description or ''}"))


-# ── Per-request registry (ContextVar) ──
-#
-# Using a ContextVar instead of a module-level global prevents concurrent
-# requests from clobbering each other's registry.  In asyncio-based LangGraph
-# each graph run executes in its own async context, so each request gets an
-# independent registry value.  For synchronous tools run via
-# loop.run_in_executor, Python copies the current context to the worker thread,
-# so the ContextVar value is correctly inherited there too.
-
-_registry_var: contextvars.ContextVar[DeferredToolRegistry | None] = contextvars.ContextVar("deferred_tool_registry", default=None)
+# ── Setup / tool ──


-def get_deferred_registry() -> DeferredToolRegistry | None:
-    return _registry_var.get()
+@dataclass(frozen=True)
+class DeferredToolSetup:
+    tool_search_tool: BaseTool | None
+    deferred_names: frozenset[str]
+    catalog_hash: str | None


-def set_deferred_registry(registry: DeferredToolRegistry) -> None:
-    _registry_var.set(registry)
+def _is_mcp_tool(t: BaseTool) -> bool:
+    return (getattr(t, "metadata", None) or {}).get("deerflow_mcp") is True


-def reset_deferred_registry() -> None:
-    """Reset the deferred registry for the current async context."""
-    _registry_var.set(None)
+def build_tool_search_tool(catalog: DeferredToolCatalog) -> BaseTool:
+    catalog_hash = catalog.hash
+
+    @tool
+    def tool_search(query: str, tool_call_id: Annotated[str, InjectedToolCallId]) -> Command:
+        """Fetches full schema definitions for deferred tools so they can be called.
+
+        Deferred tools appear by name in <available-deferred-tools> in the system
+        prompt. Until fetched, only the name is known. This tool matches a query
+        against the deferred tools and returns the matched tools complete schemas;
+        once returned, a tool becomes callable.
+
+        Query forms:
+          - "select:Read,Edit" -- fetch these exact tools by name
+          - "notebook jupyter" -- keyword search, up to max_results best matches
+          - "+slack send" -- require "slack" in the name, rank by remaining terms
+        """
+        matched = catalog.search(query)[:MAX_RESULTS]
+        if not matched:
+            content, names = f"No tools found matching: {query}", []
+        else:
+            content = json.dumps([convert_to_openai_function(t) for t in matched], indent=2, ensure_ascii=False)
+            names = [t.name for t in matched]
+        return Command(
+            update={
+                "promoted": {"catalog_hash": catalog_hash, "names": names},
+                "messages": [ToolMessage(content=content, tool_call_id=tool_call_id, name="tool_search")],
+            }
+        )
+
+    return tool_search


-# ── Tool ──
+def build_deferred_tool_setup(filtered_tools: list[BaseTool], *, enabled: bool) -> DeferredToolSetup:
+    """Build the deferred-tool setup from a POLICY-FILTERED tool list.

-
-@tool
-def tool_search(query: str) -> str:
-    """Fetches full schema definitions for deferred tools so they can be called.
-
-    Deferred tools appear by name in <available-deferred-tools> in the system
-    prompt. Until fetched, only the name is known — there is no parameter
-    schema, so the tool cannot be invoked. This tool takes a query, matches
-    it against the deferred tool list, and returns the matched tools' complete
-    definitions. Once a tool's schema appears in that result, it is callable.
-
-    Query forms:
-      - "select:Read,Edit,Grep" — fetch these exact tools by name
-      - "notebook jupyter" — keyword search, up to max_results best matches
-      - "+slack send" — require "slack" in the name, rank by remaining terms
-
-    Args:
-        query: Query to find deferred tools. Use "select:<tool_name>" for
-               direct selection, or keywords to search.
-
-    Returns:
-        Matched tool definitions as JSON array.
+    Must be called after skill/agent tool-policy filtering so the catalog never
+    exposes a tool the current agent is not allowed to use.
    """
-    registry = get_deferred_registry()
-    if not registry:
-        return "No deferred tools available."
-
-    matched_tools = registry.search(query)
-    if not matched_tools:
-        return f"No tools found matching: {query}"
-
-    # Use LangChain's built-in serialization to produce OpenAI function format.
-    # This is model-agnostic: all LLMs understand this standard schema.
-    tool_defs = [convert_to_openai_function(t) for t in matched_tools[:MAX_RESULTS]]
-
-    # Promote matched tools so the DeferredToolFilterMiddleware stops filtering
-    # them from bind_tools — the LLM now has the full schema and can invoke them.
-    registry.promote({t.name for t in matched_tools[:MAX_RESULTS]})
-
-    return json.dumps(tool_defs, indent=2, ensure_ascii=False)
+    if not enabled:
+        return DeferredToolSetup(None, frozenset(), None)
+    deferred = [t for t in filtered_tools if _is_mcp_tool(t)]
+    if not deferred:
+        return DeferredToolSetup(None, frozenset(), None)
+    catalog = DeferredToolCatalog(tuple(deferred))
+    return DeferredToolSetup(build_tool_search_tool(catalog), catalog.names, catalog.hash)
@@ -7,7 +7,6 @@ from deerflow.config.app_config import AppConfig
 from deerflow.reflection import resolve_variable
 from deerflow.sandbox.security import is_host_bash_allowed
 from deerflow.tools.builtins import ask_clarification_tool, present_file_tool, task_tool, view_image_tool
-from deerflow.tools.builtins.tool_search import get_deferred_registry
 from deerflow.tools.sync import make_sync_tool_wrapper

 logger = logging.getLogger(__name__)
@@ -127,57 +126,13 @@ def get_available_tools(
                if mcp_tools:
                    logger.info(f"Using {len(mcp_tools)} cached MCP tool(s)")

-                    # When tool_search is enabled, register MCP tools in the
-                    # deferred registry and add tool_search to builtin tools.
-                    if config.tool_search.enabled:
-                        from deerflow.tools.builtins.tool_search import DeferredToolRegistry, set_deferred_registry
-                        from deerflow.tools.builtins.tool_search import tool_search as tool_search_tool
-
-                        # Reuse the existing registry if one is already set for
-                        # this async context. ``get_available_tools`` is
-                        # re-entered whenever a subagent is spawned
-                        # (``task_tool`` calls it to build the child agent's
-                        # toolset), and previously we used to unconditionally
-                        # rebuild the registry — wiping out the parent agent's
-                        # tool_search promotions. The
-                        # ``DeferredToolFilterMiddleware`` then re-hid those
-                        # tools from subsequent model calls, leaving the agent
-                        # able to see a tool's name but unable to invoke it
-                        # (issue #2884). ``contextvars`` already gives us the
-                        # lifetime semantics we want: a fresh request / graph
-                        # run starts in a new asyncio task with the
-                        # ContextVar at its default of ``None``, so reuse is
-                        # only triggered for re-entrant calls inside one run.
-                        #
-                        # Intentionally NOT reconciling against the current
-                        # ``mcp_tools`` snapshot. The MCP cache only refreshes
-                        # on ``extensions_config.json`` mtime changes, which
-                        # in practice happens between graph runs — not inside
-                        # one. And even if a refresh did happen mid-run, the
-                        # already-built lead agent's ``ToolNode`` still holds
-                        # the *previous* tool set (LangGraph binds tools at
-                        # graph construction time), so a brand-new MCP tool
-                        # couldn't actually be invoked anyway. The
-                        # ``DeferredToolRegistry`` doesn't retain the names
-                        # of previously-promoted tools (``promote()`` drops
-                        # the entry entirely), so re-syncing the registry
-                        # against a fresh ``mcp_tools`` list would
-                        # mis-classify those promotions as new tools and
-                        # re-register them as deferred — exactly the bug
-                        # this fix exists to prevent.
-                        existing_registry = get_deferred_registry()
-                        if existing_registry is None:
-                            registry = DeferredToolRegistry()
-                            for t in mcp_tools:
-                                registry.register(t)
-                            set_deferred_registry(registry)
-                            logger.info(f"Tool search active: {len(mcp_tools)} tools deferred")
-                        else:
-                            mcp_tool_names = {t.name for t in mcp_tools}
-                            still_deferred = len(existing_registry)
-                            promoted_count = max(0, len(mcp_tool_names) - still_deferred)
-                            logger.info(f"Tool search active (preserved promotions): {still_deferred} tools deferred, {promoted_count} already promoted")
-                        builtin_tools.append(tool_search_tool)
+                    # Tag MCP-sourced tools so deferred-tool assembly (done at
+                    # the agent construction site, AFTER tool-policy filtering)
+                    # can identify them. No ContextVar / registry is built here;
+                    # the deferred catalog + tool_search tool are assembled per
+                    # agent from the policy-filtered tool list.
+                    for t in mcp_tools:
+                        t.metadata = {**(t.metadata or {}), "deerflow_mcp": True}
        except ImportError:
            logger.warning("MCP module not available. Install 'langchain-mcp-adapters' package to enable MCP tools.")
        except Exception as e: