feat(skills): add systematic-literature-review skill for multi-paper SLR workflows (#2032)

* feat(skills): add systematic-literature-review skill for multi-paper SLR workflows

Adds a new skill that produces a structured systematic literature review (SLR) across multiple academic papers on a topic. Addresses #1862 with a pure skill approach: no new tools, no architectural changes, no new dependencies.

Skill layout:
- SKILL.md: 4+1 phase workflow (plan, search, extract, synthesize, present)
- scripts/arxiv_search.py: arXiv API client, stdlib only, with a requests->urllib fallback shim modeled after github-deep-research's github_api.py
- templates/{apa,ieee,bibtex}.md: citation format templates selected dynamically in Phase 4, mirroring podcast-generation's templates/ pattern

Design notes:
- Multi-paper synthesis uses the existing `task` tool to dispatch extraction subagents in parallel. SKILL.md's Phase 3 includes a fixed decision table for batch splitting to respect the runtime's MAX_CONCURRENT_SUBAGENTS = 3 cap, and explicitly tells the agent to strip the "Task Succeeded. Result: " prefix before parsing subagent JSON output.
- arXiv only, by design. Semantic Scholar and PubMed adapters would push the scope toward a standalone MCP server (see #933) and are intentionally out of scope for this skill.
- Coexists with the existing `academic-paper-review` skill: this skill does breadth-first synthesis across many papers, academic-paper-review does single-paper peer review. The two are routed via distinct triggers and can compose (SLR on many + deep review on 1-2 important ones).
- Hard upper bound of 50 papers, tied to the Phase 3 concurrency strategy. Larger surveys degrade in synthesis quality and are better split by sub-topic.

BibTeX template explicitly uses @misc for arXiv preprints (not @article), which is the most common mistake when generating BibTeX for arXiv papers.

arxiv_search.py was smoke-tested end-to-end against the live arXiv API with two query shapes (relevance sort, submittedDate sort with category filter); all returned JSON fields parse correctly (id normalization, Atom namespace handling, URL encoding for multi-word queries).

* fix(skills): prevent LLM from saving intermediate search results to file

Adds an explicit "do not save" instruction at the end of Phase 2. Observed during Test 1 with DeepSeek: the model saved search results to a markdown file before proceeding to Phase 3, wasting 2-3 tool call rounds and increasing the risk of hitting the graph recursion limit. The search JSON should stay in context for Phase 3, not be persisted.

* fix(skills): use relevance+start-date instead of submittedDate sorting

Test 2 revealed that arXiv's submittedDate sorting returns the most recently submitted papers in the category regardless of query relevance. Searching "diffusion models" with sortBy=submittedDate in cs.CV returned papers on spatial memory, Navier-Stokes, and photon-counting CT, none about diffusion models. The LLM then retried with 4 different queries, wasting tool calls and approaching the recursion limit.

Fix: always sort by relevance; when the user wants "recent" papers, combine relevance sorting with --start-date to constrain the time window. Also add an explicit "run the search exactly once" instruction to prevent the retry loop.

* fix(skills): wrap multi-word arXiv queries in double quotes for phrase matching

Without quotes, `all:diffusion model` is parsed by arXiv's Lucene as `all:diffusion OR model`, pulling in unrelated papers from physics (thermal diffusion) and other fields. Wrapping in double quotes forces phrase matching: `all:"diffusion model"`.

Also fixes date filtering: the previous bug caused 2011 papers to appear in results despite --start-date 2024-04-09, because the unquoted query words were OR'd with the date constraint.

Verified: "diffusion models" --category cs.CV --start-date 2024-04-09 now returns only relevant diffusion model papers published after April 2024.

* fix(skills): add query phrasing guide and enforce subagent delegation

Two fixes from Test 2 observations with DeepSeek:

1. Query phrasing: add a table showing good vs bad query examples. The script wraps multi-word queries in double quotes for phrase matching, so long queries like "diffusion models in computer vision" return 0 results. Guide the LLM to use 2-3 core keywords + --category instead.
2. Subagent enforcement: DeepSeek was extracting metadata inline via python -c scripts instead of using the task tool. Strengthen Phase 3 to explicitly name the task tool, say "do not extract metadata yourself", and explain why (token budget, isolation). This is more direct than the previous natural-language-only approach while still providing the reasoning behind the constraint.

* fix(skills): strengthen search keyword guidance and subagent enforcement

Address two issues found during end-to-end testing with DeepSeek:

1. Search retry: LLM passed full topic descriptions as queries (e.g. "diffusion models in computer vision"), which returned 0 results due to exact phrase matching and triggered retries. Added explicit instruction to extract 2-3 core keywords before searching.
2. Subagent bypass: LLM used python -c to extract metadata instead of dispatching via task tool. Added explicit prohibition list (python -c, bash scripts, inline extraction) with ❌ markers for clarity.

* fix(skills): address Copilot review feedback on SLR skill

- Fix legacy arXiv ID parsing: preserve archive prefix for pre-2007 papers (e.g. hep-th/9901001 instead of just 9901001)
- Fix phase count: "four phases" -> "five phases"
- Add subagent_enabled prerequisite note to SKILL.md Notes section
- Remove PR-specific references ("PR 1") from ieee.md and bibtex.md templates, replace with workflow-scoped wording
- Fix script header: "stdlib only" -> "no additional dependencies required", fix relative path to github_api.py reference
- Remove reference to non-existent docs/enhancement/ path in header

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-05-26 01:46:01 +00:00 · 2026-04-10 08:54:28 +08:00
parent 133ffe7174
commit 16aa51c9b3
5 changed files with 949 additions and 0 deletions
@@ -0,0 +1,127 @@
+# IEEE Citation Template
+
+Use this template when the user targets an IEEE conference or journal, or explicitly asks for IEEE format. IEEE uses **numeric citations** — references are numbered in the order they first appear in the text, and in-text citations use bracketed numbers.
+
+## Citation Format Rules
+
+### In-text citations
+
+- **Single reference**: `[1]` — use the number assigned in the References section.
+- **Multiple references**: `[1], [3], [5]` or `[1]–[3]` for consecutive ranges.
+- **Citation as a noun**: "As shown in [1], ..." or "Reference [1] demonstrated...".
+- **Author attribution**: "Vaswani et al. [1] introduced..." — author names are optional in IEEE; use them when it improves readability, always followed by the bracketed number.
+
+Numbers are assigned in **order of first appearance in the text**, not alphabetically. The first reference you cite is `[1]`, the second new reference is `[2]`, and so on.
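Review note, not part of the committed template: the first-appearance rule can be sketched in a few lines of Python. The `{{key}}` placeholder syntax and the function name are hypothetical, chosen only for illustration; the template does not prescribe how a draft marks citations before numbers are assigned.

```python
import re

def number_by_first_appearance(text):
    """Replace hypothetical {{key}} placeholders with IEEE-style bracketed numbers."""
    numbers = {}  # citation key -> assigned number, filled in first-appearance order

    def replace(match):
        key = match.group(1)
        if key not in numbers:
            # A key seen for the first time takes the next unused number.
            numbers[key] = len(numbers) + 1
        return f"[{numbers[key]}]"

    numbered = re.sub(r"\{\{([^{}]+)\}\}", replace, text)
    return numbered, numbers

draft = "Self-attention {{1706.03762}} was extended by {{1810.04805}}; see also {{1706.03762}}."
numbered, numbers = number_by_first_appearance(draft)
print(numbered)
```

Because the mapping is built while scanning the text from start to end, a paper cited again later reuses its original number, which is the behaviour the rule above asks for.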
+
+### Reference list entry for arXiv preprints
+
+IEEE format for arXiv preprints:
+
+```
+[N] A. A. Author, B. B. Author, and C. C. Author, "Title of the paper," arXiv:ARXIV_ID, Year.
+```
+
+**Real example**:
+
+```
+[1] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," arXiv:1706.03762, 2017.
+```
+
+Formatting rules:
+
+- **Author names**: `FirstInitial. LastName`, with initials before the last name (the opposite of APA). Separate authors with commas and put `and` before the last author; with three or more authors, keep the comma before `and`, as in the example above.
+- **Title**: in double quotes, sentence case. No italics.
+- **Source**: `arXiv:<id>` — the literal prefix `arXiv:` followed by the bare id (e.g. `arXiv:1706.03762`, not the full URL).
+- **Year**: at the end, after a comma.
+- **URL**: optional in IEEE. Include if the publication venue requires it; otherwise the `arXiv:<id>` identifier is sufficient and is the IEEE-preferred form.
+
+### Special cases
+
+- **More than 6 authors**: IEEE allows listing the first author followed by `et al.`: `A. Vaswani et al., "Attention is all you need," arXiv:1706.03762, 2017.` Use this for papers with many authors to keep reference entries readable.
+- **If the paper has also been published at a venue**: the venue citation would normally be preferred over the arXiv one. This workflow only has arXiv metadata, however, so always use the arXiv form.
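Review note, not part of the committed template: a minimal sketch of how the formatting rules and the `et al.` special case could be applied to arXiv metadata. All function names and the `et_al_after` parameter are hypothetical. The sketch assumes author names arrive as "Given Names LastName" strings and that the title is already in sentence case; it does not handle multi-word surnames.

```python
def ieee_author(full_name):
    """Turn 'Ming-Wei Chang' into 'M.-W. Chang' (initials before the last name)."""
    *given, last = full_name.split()
    initials = [
        # Hyphenated given names keep the hyphen between their initials.
        "-".join(part[0].upper() + "." for part in name.split("-") if part)
        for name in given
    ]
    return " ".join(initials + [last])

def join_authors(names, et_al_after=None):
    """Join formatted authors; optionally shorten long lists to 'First et al.'."""
    formatted = [ieee_author(name) for name in names]
    if et_al_after is not None and len(formatted) > et_al_after:
        return f"{formatted[0]} et al."
    if len(formatted) <= 2:
        return " and ".join(formatted)
    return ", ".join(formatted[:-1]) + ", and " + formatted[-1]

def ieee_entry(number, authors, title, arxiv_id, year, et_al_after=None):
    """Build one reference-list line in the arXiv preprint form shown above."""
    author_part = join_authors(authors, et_al_after)
    return f'[{number}] {author_part}, "{title}," arXiv:{arxiv_id}, {year}.'

bert_authors = ["Jacob Devlin", "Ming-Wei Chang", "Kenton Lee", "Kristina Toutanova"]
bert_title = "BERT: Pre-training of deep bidirectional transformers for language understanding"
print(ieee_entry(2, bert_authors, bert_title, "1810.04805", 2018))
```

Passing `et_al_after=6` applies the optional shortening for papers with more than six authors; leaving it at `None` lists every author, as the full Vaswani example does. Automatic sentence-casing is deliberately left out, since it tends to lowercase proper nouns and acronyms.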
+
+## Report Structure
+
+Follow this structure verbatim. IEEE reports use **numeric citations throughout**, so assign a number to each paper **in order of first appearance** in the report text, starting with the Executive Summary, then use those numbers consistently in the per-paper annotations and the reference list.
+
+```markdown
+# Systematic Literature Review: <Topic>
+
+**Date**: <YYYY-MM-DD>
+**Papers surveyed**: <N>
+**Scope**: <arXiv search query, category, time window>
+**Citation format**: IEEE
+
+## Executive Summary
+
+<3-5 sentences summarizing the state of the literature. Cite papers with bracketed numbers as you first introduce them, e.g. "Transformer architectures [1] have become the dominant approach, with extensions focusing on efficiency [2], [3] and long-context handling [4].">
+
+## Methodology
+
+This review surveyed <N> arXiv papers retrieved on <YYYY-MM-DD> using the query `<query>`<, filtered to category <cat>><, published between <start_date> and <end_date>>. Papers were sorted by <relevance | submission date> and the top <N> were included. Metadata extraction was performed by language-model agents, with cross-paper synthesis performed by the lead agent.
+
+**Limitations of this review**: arXiv preprints are not peer-reviewed; coverage is limited to arXiv.
+
+## Themes
+
+<3-6 thematic sections. First appearance of each paper gets a bracketed number; subsequent mentions reuse the same number. Numbering runs through the whole report: papers already cited in the Executive Summary keep their numbers, and each paper cited for the first time here takes the next unused number.>
+
+### Theme 1: <Theme name>
+
+<Paragraphs describing the theme. Cite with bracketed numbers: "The original transformer architecture [1] introduced self-attention, which was later extended in [2] and [3]. Comparative analyses [4] show that...">
+
+### Theme 2: <Theme name>
+
+<...>
+
+## Convergences and Disagreements
+
+**Convergences**: <e.g. "Multiple papers [1], [3], [5] agree that X is necessary.">
+
+**Disagreements**: <e.g. "While [1] argues X, [2] finds the opposite under condition Y.">
+
+## Gaps and Open Questions
+
+<What the collective literature does not yet address, with citations to papers that explicitly mention these gaps.>
+
+## Per-Paper Annotations
+
+<One subsection per paper, ordered by their assigned reference number.>
+
+### [1] Vaswani et al., "Attention is all you need" (2017)
+
+**Research question**: <1 sentence>
+**Methodology**: <1-2 sentences>
+**Key findings**:
+- <bullet>
+- <bullet>
+- <bullet>
+**Limitations**: <1-2 sentences>
+
+### [2] <Next paper>
+
+<...>
+
+## References
+
+<Numbered list in order of first appearance in the text. The number must match the in-text citations above.>
+
+[1] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," arXiv:1706.03762, 2017.
+
+[2] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," arXiv:1810.04805, 2018.
+
+<... more entries ...>
+```
+
+## Quality checks before finalizing
+
+Before saving the report, verify:
+
+- [ ] Every paper in the surveyed set has a unique reference number.
+- [ ] Reference numbers are assigned in order of **first appearance in the text**, not alphabetically.
+- [ ] Every bracketed number in the text has a matching entry in the References section.
+- [ ] Every entry in References is cited at least once in the text.
+- [ ] Author names use `FirstInitial. LastName` format (initials before last name).
+- [ ] Titles are in double quotes and sentence case.
+- [ ] arXiv identifiers use the `arXiv:<bare_id>` form, not the full URL.
+- [ ] Per-paper annotations are ordered by reference number, matching the References section order.
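Review note, not part of the committed template: the numbering items in this checklist can be checked mechanically. The sketch below is one possible approach with hypothetical function names. It assumes `body` is the report text up to, but not including, the Per-Paper Annotations section, and `references` is the plain text of the References section with each entry starting with `[N]` at the beginning of a line.

```python
import re

def cited_numbers(body):
    """Citation numbers in order of first appearance, expanding ranges like [1]–[3]."""
    seen = []
    # Matches a single citation, optionally followed by a dash and a second one.
    for match in re.finditer(r"\[(\d+)\](?:\s*[–-]\s*\[(\d+)\])?", body):
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        for number in range(start, end + 1):
            if number not in seen:
                seen.append(number)
    return seen

def check_ieee_numbering(body, references):
    """Return a list of problems; an empty list means the numbering checks passed."""
    cited = cited_numbers(body)
    listed = [int(n) for n in re.findall(r"^\[(\d+)\]", references, flags=re.MULTILINE)]
    problems = []
    for number in cited:
        if number not in listed:
            problems.append(f"[{number}] is cited but has no entry in References")
    for number in listed:
        if number not in cited:
            problems.append(f"[{number}] is listed in References but never cited")
    if cited != sorted(cited):
        problems.append("citation numbers are not in order of first appearance")
    if len(listed) != len(set(listed)):
        problems.append("References contains duplicate numbers")
    return problems

body = "Transformers [1] were extended in [2]. Later work [1]–[3] agrees."
references = "[1] A. Vaswani et al., ...\n[2] J. Devlin et al., ...\n"
print(check_ieee_numbering(body, references))
```

The author-format, title-case, and identifier-form items still need a manual read, since they depend on judgment about names and proper nouns that a regular expression cannot make reliably.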