fix(sandbox): return actionable hint when read_file hits a binary file (#3624)

read_file decodes with UTF-8. Binary uploads (.xlsx, images, ...) raise
UnicodeDecodeError in the sandbox layer. UnicodeDecodeError is a ValueError
subclass, not an OSError, so it bypassed the typed handlers and fell through
to the generic except, surfacing a vague "Unexpected error reading file"
message. The model could not tell the file was binary, so it retried
read_file instead of switching to bash + pandas/openpyxl, burning LLM
round-trips and bloating context with repeated failures.
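
The fall-through described above can be reproduced outside the sandbox. The sketch below is illustrative only: classify_read_error is a hypothetical helper, not code from this repository, and it assumes the typed handlers all catch OSError subclasses.

```python
def classify_read_error(data: bytes) -> str:
    """Hypothetical helper: report which handler a decode failure reaches."""
    try:
        data.decode("utf-8")
    except OSError:
        # Stands in for the typed handlers (PermissionError,
        # IsADirectoryError, ...), which are all OSError subclasses.
        return "os-error"
    except Exception as e:
        # UnicodeDecodeError lands here because it derives from ValueError.
        return f"generic:{type(e).__name__}"
    return "ok"


# Bytes resembling the start of a zip-based file (such as .xlsx),
# followed by a byte that is never valid in UTF-8.
binary_like = b"PK\x03\x04\xff\xfe"
result = classify_read_error(binary_like)
```

With these inputs the decode failure skips the OSError branch and reaches the generic one, which matches the vague message the model used to receive.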

Add a dedicated UnicodeDecodeError handler that tells the model the file is
binary and to use bash with a suitable library (or view_image for images).
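
A minimal standalone sketch of the handler ordering follows. read_text_or_hint and demo are hypothetical stand-ins for read_file_tool, the message wording is abbreviated, and the real tool's path resolution and _sanitize_error call are left out.

```python
import os
import tempfile


def read_text_or_hint(path: str) -> str:
    """Hypothetical stand-in for read_file_tool's error handling."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except PermissionError:
        return f"Error: Permission denied reading file: {path}"
    except IsADirectoryError:
        return f"Error: Path is a directory, not a file: {path}"
    except UnicodeDecodeError:
        # Dedicated handler: must come before the generic except below.
        return (
            f"Error: cannot read '{path}' as text; it appears to be a binary file. "
            "read_file only supports UTF-8 text. Use bash with a suitable "
            "library instead, or view_image for images."
        )
    except Exception as e:
        return f"Error: Unexpected error reading file: {type(e).__name__}"


def demo() -> str:
    """Write a few non-UTF-8 bytes to a temp file and try to read it as text."""
    with tempfile.NamedTemporaryFile(suffix=".xlsx", delete=False) as tmp:
        tmp.write(b"PK\x03\x04\xff\xfe\x00\x01")
        path = tmp.name
    try:
        return read_text_or_hint(path)
    finally:
        os.unlink(path)
```

Calling demo() should return the binary-file hint rather than the generic "Unexpected error" message, since the UnicodeDecodeError clause is matched first.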
Xinmin Zeng
2026-06-17 21:11:44 +08:00
committed by GitHub
parent e732a741bf
commit 6a4a30fa2b
2 changed files with 71 additions and 0 deletions
@@ -1665,6 +1665,12 @@ def read_file_tool(
         return f"Error: Permission denied reading file: {requested_path}"
     except IsADirectoryError:
         return f"Error: Path is a directory, not a file: {requested_path}"
+    except UnicodeDecodeError:
+        return (
+            f"Error: cannot read '{requested_path}' as text — it appears to be a binary file "
+            "(e.g. .xlsx, .pdf, or an image). read_file only supports UTF-8 text. Use bash with a "
+            "suitable library instead (pandas/openpyxl for spreadsheets), or view_image for images."
+        )
     except Exception as e:
         return f"Error: Unexpected error reading file: {_sanitize_error(e, runtime)}"