fix(sandbox): auto-restart crashed containers transparently (#2788)

When a sandbox container crashes (e.g. due to an internal error), the
  agent enters a connection-refused loop because AioSandboxProvider.get()
  returns a cached but dead sandbox object. Add a liveness check in get()
  that detects crashed containers via backend.is_alive() and evicts them
  from all caches, allowing ensure_sandbox_initialized() to transparently
  recreate a fresh container on the next acquire().

  The behavior is controlled by a new  config option
  (default: true). Set to false to skip health checks and preserve the
  old behavior of returning stale cached sandboxes.

  Closes #2788
This commit is contained in:
Willem Jiang
2026-05-10 22:53:58 +08:00
parent 94da8f67d7
commit b67c2a4e56
4 changed files with 217 additions and 1 deletions
+5
View File
@@ -601,6 +601,11 @@ sandbox:
# # Optional: Prefix for container names (default: deer-flow-sandbox)
# # container_prefix: deer-flow-sandbox
#
# # Optional: Automatically restart crashed sandbox containers (default: true)
# # When enabled, a dead container is detected on the next tool call and
# # transparently replaced with a fresh one. Set to false to disable.
# # auto_restart: true
#
# # Optional: Additional mount directories from host to container
# # NOTE: Skills directory is automatically mounted from skills.path to skills.container_path
# # mounts: