Files

T

KKK 16aa51c9b3 feat(skills): add systematic-literature-review skill for multi-paper SLR workflows (#2032 )

* feat(skills): add systematic-literature-review skill for multi-paper SLR workflows

Adds a new skill that produces a structured systematic literature review (SLR)
across multiple academic papers on a topic. Addresses #1862 with a pure skill
approach: no new tools, no architectural changes, no new dependencies.

Skill layout:
- SKILL.md — 4+1 phase workflow (plan, search, extract, synthesize, present)
- scripts/arxiv_search.py — arXiv API client, stdlib only, with a
  requests->urllib fallback shim modeled after github-deep-research's
  github_api.py
- templates/{apa,ieee,bibtex}.md — citation format templates selected
  dynamically in Phase 4, mirroring podcast-generation's templates/ pattern

Design notes:
- Multi-paper synthesis uses the existing `task` tool to dispatch extraction
  subagents in parallel. SKILL.md's Phase 3 includes a fixed decision table
  for batch splitting to respect the runtime's MAX_CONCURRENT_SUBAGENTS = 3
  cap, and explicitly tells the agent to strip the "Task Succeeded. Result: "
  prefix before parsing subagent JSON output.
- arXiv only, by design. Semantic Scholar and PubMed adapters would push the
  scope toward a standalone MCP server (see #933) and are intentionally out
  of scope for this skill.
- Coexists with the existing `academic-paper-review` skill: this skill does
  breadth-first synthesis across many papers, academic-paper-review does
  single-paper peer review. The two are routed via distinct triggers and
  can compose (SLR on many + deep review on 1-2 important ones).
- Hard upper bound of 50 papers, tied to the Phase 3 concurrency strategy.
  Larger surveys degrade in synthesis quality and are better split by
  sub-topic.

BibTeX template explicitly uses @misc for arXiv preprints (not @article),
which is the most common mistake when generating BibTeX for arXiv papers.

arxiv_search.py was smoke-tested end-to-end against the live arXiv API with
two query shapes (relevance sort, submittedDate sort with category filter);
all returned JSON fields parse correctly (id normalization, Atom namespace
handling, URL encoding for multi-word queries).

* fix(skills): prevent LLM from saving intermediate search results to file

Adds an explicit "do not save" instruction at the end of Phase 2.
Observed during Test 1 with DeepSeek: the model saved search results
to a markdown file before proceeding to Phase 3, wasting 2-3 tool call
rounds and increasing the risk of hitting the graph recursion limit.
The search JSON should stay in context for Phase 3, not be persisted.

* fix(skills): use relevance+start-date instead of submittedDate sorting

Test 2 revealed that arXiv's submittedDate sorting returns the most
recently submitted papers in the category regardless of query relevance.
Searching "diffusion models" with sortBy=submittedDate in cs.CV returned
papers on spatial memory, Navier-Stokes, and photon-counting CT — none
about diffusion models. The LLM then retried with 4 different queries,
wasting tool calls and approaching the recursion limit.

Fix: always sort by relevance; when the user wants "recent" papers,
combine relevance sorting with --start-date to constrain the time window.
Also add an explicit "run the search exactly once" instruction to prevent
the retry loop.

* fix(skills): wrap multi-word arXiv queries in double quotes for phrase matching

Without quotes, `all:diffusion model` is parsed by arXiv's Lucene as
`all:diffusion OR model`, pulling in unrelated papers from physics
(thermal diffusion) and other fields. Wrapping in double quotes forces
phrase matching: `all:"diffusion model"`.

Also fixes date filtering: the previous bug caused 2011 papers to appear
in results despite --start-date 2024-04-09, because the unquoted query
words were OR'd with the date constraint.

Verified: "diffusion models" --category cs.CV --start-date 2024-04-09
now returns only relevant diffusion model papers published after April
2024.

* fix(skills): add query phrasing guide and enforce subagent delegation

Two fixes from Test 2 observations with DeepSeek:

1. Query phrasing: add a table showing good vs bad query examples.
   The script wraps multi-word queries in double quotes for phrase
   matching, so long queries like "diffusion models in computer vision"
   return 0 results. Guide the LLM to use 2-3 core keywords + --category
   instead.

2. Subagent enforcement: DeepSeek was extracting metadata inline via
   python -c scripts instead of using the task tool. Strengthen Phase 3
   to explicitly name the task tool, say "do not extract metadata
   yourself", and explain why (token budget, isolation). This is more
   direct than the previous natural-language-only approach while still
   providing the reasoning behind the constraint.

* fix(skills): strengthen search keyword guidance and subagent enforcement

Address two issues found during end-to-end testing with DeepSeek:

1. Search retry: LLM passed full topic descriptions as queries (e.g.
   "diffusion models in computer vision"), which returned 0 results due
   to exact phrase matching and triggered retries. Added explicit
   instruction to extract 2-3 core keywords before searching.

2. Subagent bypass: LLM used python -c to extract metadata instead of
   dispatching via task tool. Added explicit prohibition list (python -c,
   bash scripts, inline extraction) with ❌ markers for clarity.

* fix(skills): address Copilot review feedback on SLR skill

- Fix legacy arXiv ID parsing: preserve archive prefix for pre-2007
  papers (e.g. hep-th/9901001 instead of just 9901001)
- Fix phase count: "four phases" -> "five phases"
- Add subagent_enabled prerequisite note to SKILL.md Notes section
- Remove PR-specific references ("PR 1") from ieee.md and bibtex.md
  templates, replace with workflow-scoped wording
- Fix script header: "stdlib only" -> "no additional dependencies
  required", fix relative path to github_api.py reference
- Remove reference to non-existent docs/enhancement/ path in header

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

2026-04-10 08:54:28 +08:00

5.9 KiB

Raw Blame History

IEEE Citation Template

Use this template when the user targets an IEEE conference or journal, or explicitly asks for IEEE format. IEEE uses numeric citations — references are numbered in the order they first appear in the text, and in-text citations use bracketed numbers.

Citation Format Rules

In-text citations

Single reference: [1] — use the number assigned in the References section.
Multiple references: [1], [3], [5] or [1]–[3] for consecutive ranges.
Citation as a noun: "As shown in [1], ..." or "Reference [1] demonstrated...".
Author attribution: "Vaswani et al. [1] introduced..." — author names are optional in IEEE; use them when it improves readability, always followed by the bracketed number.

Numbers are assigned in order of first appearance in the text, not alphabetically. The first reference you cite is [1], the second new reference is [2], and so on.

Reference list entry for arXiv preprints

IEEE format for arXiv preprints:

[N] A. A. Author, B. B. Author, and C. C. Author, "Title of the paper," arXiv:ARXIV_ID, Year.

Real example:

[1] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," arXiv:1706.03762, 2017.

Formatting rules:

Author names: FirstInitial. LastName — initials before the last name, opposite of APA. Join with commas; last author gets and (no Oxford comma before it in strict IEEE, but accepted).
Title: in double quotes, sentence case. No italics.
Source: arXiv:<id> — the literal prefix arXiv: followed by the bare id (e.g. arXiv:1706.03762, not the full URL).
Year: at the end, after a comma.
URL: optional in IEEE. Include if the publication venue requires it; otherwise the arXiv:<id> identifier is sufficient and is the IEEE-preferred form.

Special cases

More than 6 authors: IEEE allows listing the first author followed by et al.: A. Vaswani et al., "Attention is all you need," arXiv:1706.03762, 2017. Use this for papers with many authors to keep reference entries readable.
If the paper has also been published at a venue: prefer the venue citation format over arXiv. In this workflow we only have arXiv metadata, so always use the arXiv form.

Report Structure

Follow this structure verbatim. Note that IEEE reports use numeric citations throughout, so you need to assign a number to each paper in order of first appearance in the Themes section, then use those numbers consistently in per-paper annotations and the reference list.

# Systematic Literature Review: <Topic>

**Date**: <YYYY-MM-DD>
**Papers surveyed**: <N>
**Scope**: <arXiv search query, category, time window>
**Citation format**: IEEE

## Executive Summary

<3-5 sentences summarizing the state of the literature. Cite papers with bracketed numbers as you first introduce them, e.g. "Transformer architectures [1] have become the dominant approach, with extensions focusing on efficiency [2], [3] and long-context handling [4].">

## Methodology

This review surveyed <N> arXiv papers retrieved on <YYYY-MM-DD> using the query `<query>`<, filtered to category <cat>><, published between <start_date> and <end_date>>. Papers were sorted by <relevance | submission date> and the top <N> were included. Metadata extraction was performed by language-model agents, with cross-paper synthesis performed by the lead agent.

**Limitations of this review**: arXiv preprints are not peer-reviewed; coverage is limited to arXiv.

## Themes

<3-6 thematic sections. First appearance of each paper gets a bracketed number; subsequent mentions reuse the same number. The number assignment order is: first paper mentioned in Theme 1 gets [1], next new paper gets [2], etc.>

### Theme 1: <Theme name>

<Paragraphs describing the theme. Cite with bracketed numbers: "The original transformer architecture [1] introduced self-attention, which was later extended in [2] and [3]. Comparative analyses [4] show that...">

### Theme 2: <Theme name>

<...>

## Convergences and Disagreements

**Convergences**: <e.g. "Multiple papers [1], [3], [5] agree that X is necessary.">

**Disagreements**: <e.g. "While [1] argues X, [2] finds the opposite under condition Y.">

## Gaps and Open Questions

<What the collective literature does not yet address, with citations to papers that explicitly mention these gaps.>

## Per-Paper Annotations

<One subsection per paper, ordered by their assigned reference number.>

### [1] Vaswani et al., "Attention is all you need" (2017)

**Research question**: <1 sentence>
**Methodology**: <1-2 sentences>
**Key findings**:
- <bullet>
- <bullet>
- <bullet>
**Limitations**: <1-2 sentences>

### [2] <Next paper>

<...>

## References

<Numbered list in order of first appearance in the text. The number must match the in-text citations above.>

[1] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," arXiv:1706.03762, 2017.

[2] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," arXiv:1810.04805, 2018.

<... more entries ...>

Quality checks before finalizing

Before saving the report, verify:

Every paper in the surveyed set has a unique reference number.
Reference numbers are assigned in order of first appearance in the text, not alphabetically.
Every bracketed number in the text has a matching entry in the References section.
Every entry in References is cited at least once in the text.
Author names use FirstInitial. LastName format (initials before last name).
Titles are in double quotes and sentence case.
arXiv identifiers use the arXiv:<bare_id> form, not the full URL.
Per-paper annotations are ordered by reference number, matching the References section order.

5.9 KiB Raw Blame History Unescape Escape