a814ab50b5
The moderation model's response was silently falling through to a
conservative block when LLMs wrapped structured output in markdown
code fences, added prose around the JSON, returned case-variant
decisions (e.g. "Allow"), or included nested braces in the reason
field. The greedy `\{.*\}` regex also over-matched on nested braces.
- Rewrite _extract_json_object() with markdown fence stripping and
brace-balanced string-aware extraction
- Normalize decision field to lowercase for case-insensitive matching
- Distinguish "model unavailable" from "unparseable output" in fallback
- Strengthen system prompt to explicitly forbid code fences and prose
- Add 15 tests covering all reported scenarios
Fixes #2985