3 Commits

Author SHA1 Message Date
Willem Jiang a814ab50b5 fix(skills): make security scanner JSON parsing robust for LLM output variations (#2987)
The moderation model's response was silently falling through to a
  conservative block when LLMs wrapped structured output in markdown
  code fences, added prose around the JSON, returned case-variant
  decisions (e.g. "Allow"), or included nested braces in the reason
  field. The greedy `\{.*\}` regex also over-matched on nested braces.

  - Rewrite _extract_json_object() with markdown fence stripping and
    brace-balanced string-aware extraction
  - Normalize decision field to lowercase for case-insensitive matching
  - Distinguish "model unavailable" from "unparseable output" in fallback
  - Strengthen system prompt to explicitly forbid code fences and prose
  - Add 15 tests covering all reported scenarios

  Fixes #2985
2026-05-17 08:59:42 +08:00
Airene Fang 11f557a2c6 feat(trace):Add run_name to the trace info for system agents. (#2492)
* feat(trace): Add `run_name` to the trace info for suggestions and memory.

before(in langsmith):
CodexChatModel
CodexChatModel
lead_agent
after:
suggest_agent
memory_agent
lead_agent

feat(trace): Add `run_name` to the trace info for suggestions and memory.

before(in langsmith):
CodexChatModel
CodexChatModel
lead_agent
after:
suggest_agent
memory_agent
lead_agent

* feat(trace): Add `run_name` to the trace info for system agents.

before(in langsmith):
CodexChatModel
CodexChatModel
CodexChatModel
CodexChatModel
lead_agent
after:
suggest_agent
title_agent
security_agent
memory_agent
lead_agent

* chore(code format):code format

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-24 17:06:55 +08:00
DanielWalnut 888f7bfb9d Implement skill self-evolution and skill_manage flow (#1874)
* chore: ignore .worktrees directory

* Add skill_manage self-evolution flow

* Fix CI regressions for skill_manage

* Address PR review feedback for skill evolution

* fix(skill-evolution): preserve history on delete

* fix(skill-evolution): tighten scanner fallbacks

* docs: add skill_manage e2e evidence screenshot

* fix(skill-manage): avoid blocking fs ops in session runtime

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-06 22:07:11 +08:00