mirror of
https://github.com/bytedance/deer-flow.git
synced 2026-06-18 13:46:02 +00:00
* fix(frontend): keep workspace interactive when SSR auth probe cannot reach gateway (#3493)

  When the SSR auth probe at /api/v1/auth/me times out or fails, the workspace layout used to render a static fallback page without AuthProvider or QueryClientProvider, making logout and every other interaction non-functional until the gateway recovered.

  Render the normal WorkspaceContent in 'gateway_unavailable' mode instead, surfacing a polite offline banner that re-probes the gateway in the background and hides itself the moment refreshUser() returns an authenticated user. The probe is reentrancy-guarded so a slow gateway cannot pile up parallel /auth/me requests.

  Closes #3493

* fix(workspace): silent probe in offline banner to avoid /login redirect during gateway recovery (#3493)

  The banner previously delegated retry probes to AuthProvider.refreshUser(), which treats any 401 from /api/v1/auth/me as 'session expired' and force-redirects to /login. During gateway recovery, the first few requests may transiently return 401 before the gateway is fully ready, which would incorrectly kick the user out — defeating the purpose of the offline banner.

  Now the banner silently fetches /api/v1/auth/me itself and only delegates to refreshUser() on 200 OK. Non-200 responses (401 / 5xx / network) are swallowed and retried on the next interval tick, ensuring the user stays logged in across short gateway outages.

  Verified in Docker:
  - docker pause deer-flow-gateway → banner appears, page interactive
  - docker unpause deer-flow-gateway → banner auto-disappears within 10s, user remains on /workspace/chats/new with full session restored
  - All 117 unit tests pass

* fix(workspace): fix banner polling leak and persistent 401 handling (#3493)

  - Stop polling immediately after user recovery: add user to effect dependencies, cleanup interval when user !== null
  - Handle persistent 401: trigger login redirect after 3 consecutive unauthorized responses
  - Extract decision logic to pure helper, add 8 unit tests covering all critical paths

* fix(workspace): address CR feedback on gateway offline recovery (#3493)

  - gateway-offline-banner-helpers: decrement (not reset) auth-failure streak on transient outcomes so a flapping gateway (401 alternating with 5xx) still converges on session-expired
  - gateway-offline-banner: reuse probe response body to apply user directly via new AuthProvider.applyUser, halving the recovery burst against an already-struggling gateway
  - gateway-offline-banner: extract classifyProbe into helpers for unit testability; log probe failures via console.warn instead of swallowing
  - gateway-offline-fallback: new shared component used by both workspace and (auth) layouts so auth pages recover the same way the workspace does, fixing the lockup where bare static HTML had no AuthProvider
  - AuthProvider.logout: fall back to hard navigation when the gateway logout fetch fails, matching legacy form-POST behaviour and avoiding stale client state during outage
  - tests: extend gateway-offline-banner-helpers.test with flapping convergence and classifyProbe branch coverage (19 cases total)
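
The reentrancy guard mentioned in the first commit can be illustrated in isolation. The following is a minimal sketch, not the code from this commit: `makeGuarded`, `slowProbe`, and `demo` are hypothetical names, and the slow gateway is simulated with a promise that resolves only when released by hand.

```typescript
// Hypothetical helper: wraps an async task so that overlapping calls are
// skipped instead of queued. Returns true if the task ran, false if skipped.
function makeGuarded(task: () => Promise<void>): () => Promise<boolean> {
  let inFlight = false;
  return async () => {
    if (inFlight) return false; // a previous call is still pending: skip
    inFlight = true;
    try {
      await task();
      return true;
    } finally {
      inFlight = false; // always release, even if the task throws
    }
  };
}

// Simulated slow gateway: each probe stays pending until release() is called.
let started = 0;
let release: () => void = () => {};
const slowProbe = (): Promise<void> =>
  new Promise<void>((resolve) => {
    started += 1;
    release = () => resolve();
  });

const guardedProbe = makeGuarded(slowProbe);

async function demo(): Promise<boolean[]> {
  const first = guardedProbe(); // starts the probe
  const second = guardedProbe(); // skipped: the first is still in flight
  release();
  const results = [await first, await second];
  const third = guardedProbe(); // guard was released, so this one runs
  release();
  results.push(await third);
  return results;
}
```

In this sketch only two of the three calls reach the simulated gateway. The interval timer can keep ticking at its fixed rate while a stalled request blocks further probes, rather than stacking new ones behind it.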
@@ -0,0 +1,87 @@
export const OFFLINE_BANNER_RETRY_INTERVAL_MS = 10_000;

/**
 * Number of consecutive 401 responses before treating the session as
 * expired and delegating to AuthProvider.refreshUser() for /login redirect.
 *
 * Threshold > 1 absorbs transient 401s that may occur in the first few
 * milliseconds after a gateway becomes ready again, without indefinitely
 * masking a genuinely expired cookie.
 */
export const OFFLINE_BANNER_AUTH_FAILURE_THRESHOLD = 3;

import type { User } from "@/core/auth/types";

export function shouldShowOfflineBanner(
  user: User | null,
  gatewayUnavailable: boolean,
): boolean {
  return gatewayUnavailable && user === null;
}

/** Categorised outcome of a single /auth/me probe. */
export type ProbeOutcome =
  | { kind: "ok"; user: User } // 2xx with parsed body
  | { kind: "unauthorized" } // 401
  | { kind: "transient" }; // 5xx, network, abort, malformed body, etc.

/** Next action the banner effect should take after a probe. */
export type ProbeAction =
  | { type: "apply-user"; user: User }
  | { type: "delegate-refresh"; reason: "session-expired" }
  | { type: "noop"; nextFailureCount: number };

/**
 * Pure: classify an HTTP probe outcome into ProbeOutcome.
 *
 * Extracted from the banner effect so it can be unit-tested independently.
 * `parsedUser` is the JSON body of a 2xx response (or null if absent/malformed);
 * surfacing it through ProbeOutcome lets the caller apply it directly instead
 * of paying for a second /auth/me round-trip via refreshUser().
 */
export function classifyProbe(
  res: Response | null,
  errored: boolean,
  parsedUser: User | null = null,
): ProbeOutcome {
  if (errored || res === null) return { kind: "transient" };
  if (res.ok && parsedUser !== null) return { kind: "ok", user: parsedUser };
  if (res.ok) return { kind: "transient" }; // 2xx but body unusable
  if (res.status === 401) return { kind: "unauthorized" };
  return { kind: "transient" };
}

/**
 * Pure state machine for what to do after a probe lands.
 *
 * Inputs: how many consecutive 401s we've seen so far + the new outcome.
 * Outputs: either "apply the user body we just fetched", "delegate to
 * refreshUser() for /login redirect", or "do nothing, update counter".
 *
 * Transient outcomes (5xx / network / abort) decrement the auth-failure
 * streak by 1 (floored at 0) rather than resetting it. This prevents a
 * flapping gateway that alternates 401 ↔ 5xx from indefinitely masking a
 * genuinely expired session: the streak still converges on the threshold.
 */
export function decideProbeAction(
  consecutiveAuthFailures: number,
  outcome: ProbeOutcome,
  threshold: number = OFFLINE_BANNER_AUTH_FAILURE_THRESHOLD,
): ProbeAction {
  if (outcome.kind === "ok") {
    return { type: "apply-user", user: outcome.user };
  }
  if (outcome.kind === "unauthorized") {
    const next = consecutiveAuthFailures + 1;
    if (next >= threshold) {
      return { type: "delegate-refresh", reason: "session-expired" };
    }
    return { type: "noop", nextFailureCount: next };
  }
  // transient: decrement rather than reset so a flapping gateway
  // (alternating 401 ↔ 5xx) still converges on session-expired.
  return {
    type: "noop",
    nextFailureCount: Math.max(0, consecutiveAuthFailures - 1),
  };
}
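
To see how the decrement rule in decideProbeAction behaves over a run of probes, the counter can be traced by hand. The sketch below restates only the counter arithmetic with simplified types; `nextStreak` and `probesUntilExpired` are hypothetical names, not part of this commit. Under these assumptions the streak reaches the threshold whenever 401s outnumber transient outcomes, while a strict one-to-one alternation of 401 and 5xx oscillates between 1 and 0.

```typescript
type Outcome = "ok" | "unauthorized" | "transient";

// Simplified restatement of the counter rule: a 401 increments the streak,
// a transient outcome decrements it (floored at 0).
function nextStreak(streak: number, outcome: Outcome): number {
  if (outcome === "unauthorized") return streak + 1;
  if (outcome === "transient") return Math.max(0, streak - 1);
  return 0; // "ok": the banner resets its counter when it applies the user
}

// Returns the 1-based probe index at which the streak reaches the threshold,
// or null if the sequence ends before that happens.
function probesUntilExpired(
  outcomes: Outcome[],
  threshold = 3,
): number | null {
  let streak = 0;
  let index = 0;
  for (const outcome of outcomes) {
    index += 1;
    streak = nextStreak(streak, outcome);
    if (streak >= threshold) return index;
  }
  return null;
}

// Three 401s in a row: streak 1, 2, 3.
const steady = probesUntilExpired([
  "unauthorized",
  "unauthorized",
  "unauthorized",
]);

// Two 401s per transient: streak 1, 2, 1, 2, 3.
const flapping = probesUntilExpired([
  "unauthorized",
  "unauthorized",
  "transient",
  "unauthorized",
  "unauthorized",
]);

// Strict 1:1 alternation: streak 1, 0, 1, 0, 1, 0.
const alternating = probesUntilExpired([
  "unauthorized",
  "transient",
  "unauthorized",
  "transient",
  "unauthorized",
  "transient",
]);
```

Here `steady` is 3, `flapping` is 5, and `alternating` is null. The last case suggests that convergence depends on the mix of outcomes, which may be worth covering in the flapping tests.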
@@ -0,0 +1,130 @@
"use client";

import { useEffect, useRef } from "react";

import { useAuth } from "@/core/auth/AuthProvider";
import { userSchema, type User } from "@/core/auth/types";
import { useI18n } from "@/core/i18n/hooks";

import {
  OFFLINE_BANNER_RETRY_INTERVAL_MS,
  classifyProbe,
  decideProbeAction,
  shouldShowOfflineBanner,
} from "./gateway-offline-banner-helpers";

interface GatewayOfflineBannerProps {
  /**
   * True when the server-side auth probe at `/api/v1/auth/me` could not
   * reach the gateway. The banner stays mounted until a client-side probe
   * confirms the gateway is healthy and `user` becomes populated.
   */
  gatewayUnavailable: boolean;
}

export function GatewayOfflineBanner({
  gatewayUnavailable,
}: GatewayOfflineBannerProps) {
  const { t } = useI18n();
  const { user, applyUser, refreshUser, logout } = useAuth();
  // Guard against piling up probe calls while the gateway is still slow.
  const inFlightRef = useRef(false);
  // Count consecutive 401s so we can distinguish "transient warm-up 401"
  // from "session actually expired" and avoid lying with the banner.
  const authFailuresRef = useRef(0);

  useEffect(() => {
    if (!gatewayUnavailable) return;
    // Once AuthProvider has a user again the banner has served its
    // purpose; tear down the polling so we don't keep probing every 10s
    // for the entire lifetime of the page (gatewayUnavailable is a
    // server-rendered prop and stays true until a full reload).
    if (user !== null) return;

    const probe = async () => {
      if (inFlightRef.current) return;
      inFlightRef.current = true;
      let res: Response | null = null;
      let errored = false;
      let parsedUser: User | null = null;
      try {
        res = await fetch("/api/v1/auth/me", {
          credentials: "include",
          cache: "no-store",
        });
        // Reuse the probe's own response body instead of triggering a
        // second /auth/me request via refreshUser() — halves the recovery
        // burst against an already-struggling gateway.
        if (res.ok) {
          try {
            const data = await res.json();
            const parsed = userSchema.safeParse(data);
            if (parsed.success) parsedUser = parsed.data;
          } catch (err) {
            console.warn(
              "[gateway-offline-banner] probe body parse failed:",
              err,
            );
          }
        }
      } catch (err) {
        console.warn("[gateway-offline-banner] probe failed:", err);
        errored = true;
      } finally {
        inFlightRef.current = false;
      }

      const action = decideProbeAction(
        authFailuresRef.current,
        classifyProbe(res, errored, parsedUser),
      );

      if (action.type === "apply-user") {
        authFailuresRef.current = 0;
        applyUser(action.user);
        return;
      }
      if (action.type === "delegate-refresh") {
        // Hand off to AuthProvider, which on 401 will /login-redirect.
        authFailuresRef.current = 0;
        await refreshUser();
        return;
      }
      authFailuresRef.current = action.nextFailureCount;
    };

    void probe();
    const handle = window.setInterval(() => {
      void probe();
    }, OFFLINE_BANNER_RETRY_INTERVAL_MS);
    return () => {
      window.clearInterval(handle);
    };
  }, [gatewayUnavailable, user, applyUser, refreshUser]);

  if (!shouldShowOfflineBanner(user, gatewayUnavailable)) {
    return null;
  }

  return (
    <div
      role="status"
      aria-live="polite"
      className="bg-muted text-muted-foreground flex items-center justify-between gap-3 border-b px-4 py-2 text-sm"
    >
      <span>
        {t.workspace.gatewayUnavailable}{" "}
        {t.workspace.gatewayUnavailableRetrying}
      </span>
      <button
        type="button"
        onClick={() => {
          void logout();
        }}
        className="hover:bg-background rounded-md border px-3 py-1 text-xs"
      >
        {t.workspace.logout}
      </button>
    </div>
  );
}
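
The probe in the banner above hands `(res, errored, parsedUser)` to classifyProbe. Its branch table can be exercised without a network or DOM types. This is a rough sketch under stated assumptions: `ProbeResponse` and `SketchUser` are hypothetical stand-ins for the real `Response` and `User` types, and `classify` repeats the branch order of classifyProbe for illustration only.

```typescript
// Hypothetical stand-ins so the sketch runs without DOM types or a gateway.
interface ProbeResponse {
  ok: boolean;
  status: number;
}
interface SketchUser {
  id: string;
}

type Outcome =
  | { kind: "ok"; user: SketchUser }
  | { kind: "unauthorized" }
  | { kind: "transient" };

// Same branch order as classifyProbe: errors first, then usable 2xx,
// then unusable 2xx, then 401, and everything else is transient.
function classify(
  res: ProbeResponse | null,
  errored: boolean,
  parsedUser: SketchUser | null = null,
): Outcome {
  if (errored || res === null) return { kind: "transient" };
  if (res.ok && parsedUser !== null) return { kind: "ok", user: parsedUser };
  if (res.ok) return { kind: "transient" }; // 2xx but body unusable
  if (res.status === 401) return { kind: "unauthorized" };
  return { kind: "transient" };
}

const sampleUser: SketchUser = { id: "u-1" };

const kinds: string[] = [
  classify(null, true), // fetch threw: network error or abort
  classify({ ok: true, status: 200 }, false, sampleUser), // healthy gateway
  classify({ ok: true, status: 200 }, false, null), // 200 with malformed body
  classify({ ok: false, status: 401 }, false), // unauthorized
  classify({ ok: false, status: 503 }, false), // gateway still warming up
].map((outcome) => outcome.kind);
```

With these inputs `kinds` is `["transient", "ok", "transient", "unauthorized", "transient"]`. A 200 response whose body fails validation is treated as transient, so it neither applies a user nor counts toward the 401 streak.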
@@ -0,0 +1,36 @@
"use client";

import { AuthProvider } from "@/core/auth/AuthProvider";

import { GatewayOfflineBanner } from "./gateway-offline-banner";

interface GatewayOfflineFallbackProps {
  /**
   * When true, this component renders its own banner. The workspace layout
   * sets this to false because WorkspaceContent already mounts the banner
   * inside its sidebar layout. The (auth) layout sets it to true because
   * its plain children have no banner of their own.
   */
  renderBanner?: boolean;
  children?: React.ReactNode;
}

/**
 * Shared fallback shown by both the workspace and (auth) layouts when the
 * server-side auth probe could not reach the gateway. Wraps the children
 * with an AuthProvider so the banner's probe / logout / refresh hooks work
 * — fixing the `(auth)/layout.tsx` lockup where the bare static HTML had
 * no AuthProvider / QueryClientProvider and the user could not recover
 * without a manual reload.
 */
export function GatewayOfflineFallback({
  renderBanner = false,
  children,
}: GatewayOfflineFallbackProps) {
  return (
    <AuthProvider initialUser={null}>
      {renderBanner && <GatewayOfflineBanner gatewayUnavailable />}
      {children}
    </AuthProvider>
  );
}