* feat(skills): add systematic-literature-review skill for multi-paper SLR workflows

Adds a new skill that produces a structured systematic literature review (SLR) across multiple academic papers on a topic. Addresses #1862 with a pure skill approach: no new tools, no architectural changes, no new dependencies.

Skill layout:
- SKILL.md: 4+1 phase workflow (plan, search, extract, synthesize, present)
- scripts/arxiv_search.py: arXiv API client, stdlib only, with a requests->urllib fallback shim modeled after github-deep-research's github_api.py
- templates/{apa,ieee,bibtex}.md: citation format templates selected dynamically in Phase 4, mirroring podcast-generation's templates/ pattern

Design notes:
- Multi-paper synthesis uses the existing `task` tool to dispatch extraction subagents in parallel. SKILL.md's Phase 3 includes a fixed decision table for batch splitting to respect the runtime's MAX_CONCURRENT_SUBAGENTS = 3 cap, and explicitly tells the agent to strip the "Task Succeeded. Result: " prefix before parsing subagent JSON output.
- arXiv only, by design. Semantic Scholar and PubMed adapters would push the scope toward a standalone MCP server (see #933) and are intentionally out of scope for this skill.
- Coexists with the existing `academic-paper-review` skill: this skill does breadth-first synthesis across many papers, academic-paper-review does single-paper peer review. The two are routed via distinct triggers and can compose (SLR on many + deep review on 1-2 important ones).
- Hard upper bound of 50 papers, tied to the Phase 3 concurrency strategy. Larger surveys degrade in synthesis quality and are better split by sub-topic.

BibTeX template explicitly uses @misc for arXiv preprints (not @article), which is the most common mistake when generating BibTeX for arXiv papers.

arxiv_search.py was smoke-tested end-to-end against the live arXiv API with two query shapes (relevance sort, submittedDate sort with category filter); all returned JSON fields parse correctly (id normalization, Atom namespace handling, URL encoding for multi-word queries).

* fix(skills): prevent LLM from saving intermediate search results to file

Adds an explicit "do not save" instruction at the end of Phase 2. Observed during Test 1 with DeepSeek: the model saved search results to a markdown file before proceeding to Phase 3, wasting 2-3 tool call rounds and increasing the risk of hitting the graph recursion limit. The search JSON should stay in context for Phase 3, not be persisted.

* fix(skills): use relevance+start-date instead of submittedDate sorting

Test 2 revealed that arXiv's submittedDate sorting returns the most recently submitted papers in the category regardless of query relevance. Searching "diffusion models" with sortBy=submittedDate in cs.CV returned papers on spatial memory, Navier-Stokes, and photon-counting CT, none about diffusion models. The LLM then retried with 4 different queries, wasting tool calls and approaching the recursion limit.

Fix: always sort by relevance; when the user wants "recent" papers, combine relevance sorting with --start-date to constrain the time window. Also add an explicit "run the search exactly once" instruction to prevent the retry loop.

* fix(skills): wrap multi-word arXiv queries in double quotes for phrase matching

Without quotes, `all:diffusion model` is parsed by arXiv's Lucene as `all:diffusion OR model`, pulling in unrelated papers from physics (thermal diffusion) and other fields. Wrapping in double quotes forces phrase matching: `all:"diffusion model"`.

Also fixes date filtering: the previous bug caused 2011 papers to appear in results despite --start-date 2024-04-09, because the unquoted query words were OR'd with the date constraint.

Verified: "diffusion models" --category cs.CV --start-date 2024-04-09 now returns only relevant diffusion model papers published after April 2024.

* fix(skills): add query phrasing guide and enforce subagent delegation

Two fixes from Test 2 observations with DeepSeek:

1. Query phrasing: add a table showing good vs bad query examples. The script wraps multi-word queries in double quotes for phrase matching, so long queries like "diffusion models in computer vision" return 0 results. Guide the LLM to use 2-3 core keywords + --category instead.
2. Subagent enforcement: DeepSeek was extracting metadata inline via python -c scripts instead of using the task tool. Strengthen Phase 3 to explicitly name the task tool, say "do not extract metadata yourself", and explain why (token budget, isolation). This is more direct than the previous natural-language-only approach while still providing the reasoning behind the constraint.

* fix(skills): strengthen search keyword guidance and subagent enforcement

Address two issues found during end-to-end testing with DeepSeek:

1. Search retry: LLM passed full topic descriptions as queries (e.g. "diffusion models in computer vision"), which returned 0 results due to exact phrase matching and triggered retries. Added explicit instruction to extract 2-3 core keywords before searching.
2. Subagent bypass: LLM used python -c to extract metadata instead of dispatching via task tool. Added explicit prohibition list (python -c, bash scripts, inline extraction) with ❌ markers for clarity.

* fix(skills): address Copilot review feedback on SLR skill

- Fix legacy arXiv ID parsing: preserve archive prefix for pre-2007 papers (e.g. hep-th/9901001 instead of just 9901001)
- Fix phase count: "four phases" -> "five phases"
- Add subagent_enabled prerequisite note to SKILL.md Notes section
- Remove PR-specific references ("PR 1") from ieee.md and bibtex.md templates, replace with workflow-scoped wording
- Fix script header: "stdlib only" -> "no additional dependencies required", fix relative path to github_api.py reference
- Remove reference to non-existent docs/enhancement/ path in header

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
IEEE Citation Template
Use this template when the user targets an IEEE conference or journal, or explicitly asks for IEEE format. IEEE uses numeric citations — references are numbered in the order they first appear in the text, and in-text citations use bracketed numbers.
Citation Format Rules
In-text citations
- Single reference: `[1]`. Use the number assigned in the References section.
- Multiple references: `[1], [3], [5]` or `[1]–[3]` for consecutive ranges.
- Citation as a noun: "As shown in [1], ..." or "Reference [1] demonstrated...".
- Author attribution: "Vaswani et al. [1] introduced...". Author names are optional in IEEE; use them when it improves readability, always followed by the bracketed number.
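The grouping rule for multiple references can be sketched in a few lines of Python. This is an illustrative helper, not part of the skill: the name `format_citation_group` is hypothetical, and collapsing only runs of three or more consecutive numbers is an assumption chosen so that two neighbours stay as `[2], [3]`.

```python
EN_DASH = "\u2013"  # the range separator used in "[1]–[3]"


def format_citation_group(numbers):
    """Render reference numbers as one IEEE in-text citation group."""
    nums = sorted(set(numbers))
    if not nums:
        raise ValueError("at least one reference number is required")

    parts = []
    start = prev = nums[0]
    # The trailing None is a sentinel that flushes the final run.
    for n in nums[1:] + [None]:
        if n is not None and n == prev + 1:
            prev = n
            continue
        if prev - start >= 2:
            # Assumption: collapse only runs of three or more numbers.
            parts.append(f"[{start}]{EN_DASH}[{prev}]")
        else:
            parts.extend(f"[{k}]" for k in range(start, prev + 1))
        if n is not None:
            start = prev = n
    return ", ".join(parts)


print(format_citation_group([1, 3, 5]))
print(format_citation_group([1, 2, 3]))
```

The input is deduplicated and sorted first, so the helper can be called with numbers in whatever order they were collected.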
Numbers are assigned in order of first appearance in the text, not alphabetically. The first reference you cite is [1], the second new reference is [2], and so on.
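As a minimal sketch of this numbering rule, assuming the papers are identified by their arXiv ids and that the caller supplies the ids in the order they are cited in the text (the function name `assign_reference_numbers` is hypothetical):

```python
def assign_reference_numbers(cited_ids_in_text_order):
    """Map each paper id to its IEEE reference number.

    Numbers follow the order of first appearance; repeated citations
    of the same paper reuse the number it already has.
    """
    numbers = {}
    for paper_id in cited_ids_in_text_order:
        if paper_id not in numbers:
            numbers[paper_id] = len(numbers) + 1
    return numbers


citations = ["1706.03762", "1810.04805", "1706.03762"]
print(assign_reference_numbers(citations))
```

Because Python dictionaries preserve insertion order, iterating over the result also yields the papers in reference-list order.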
Reference list entry for arXiv preprints
IEEE format for arXiv preprints:
[N] A. A. Author, B. B. Author, and C. C. Author, "Title of the paper," arXiv:ARXIV_ID, Year.
Real example:
[1] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," arXiv:1706.03762, 2017.
Formatting rules:
- Author names: `FirstInitial. LastName`, with initials before the last name (the opposite of APA). Join with commas and put `and` before the last author; with three or more authors, keep the comma before `and`, as in the example above.
- Title: in double quotes, sentence case. No italics.
- Source: `arXiv:<id>`, i.e. the literal prefix `arXiv:` followed by the bare id (e.g. `arXiv:1706.03762`, not the full URL).
- Year: at the end, after a comma.
- URL: optional in IEEE. Include if the publication venue requires it; otherwise the `arXiv:<id>` identifier is sufficient and is the IEEE-preferred form.
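The formatting rules above can be sketched as a small Python formatter. All function names here are hypothetical, and the sketch rests on stated assumptions: author names arrive as "Given Family" strings with a single-word family name, the title is already in sentence case, and the id is already the bare arXiv id.

```python
def ieee_author(full_name):
    """'Ming-Wei Chang' -> 'M.-W. Chang' (initials before the last name)."""
    *given, last = full_name.split()
    initials = []
    for name in given:
        # Keep hyphens in compound given names: "Ming-Wei" -> "M.-W."
        initials.append("-".join(f"{part[0]}." for part in name.split("-") if part))
    return " ".join(initials + [last])


def join_authors(names):
    """Join formatted names with commas and 'and' before the last one."""
    if len(names) == 1:
        return names[0]
    if len(names) == 2:
        return f"{names[0]} and {names[1]}"
    return ", ".join(names[:-1]) + f", and {names[-1]}"


def ieee_arxiv_entry(number, authors, title, arxiv_id, year):
    """Build one reference-list line for an arXiv preprint."""
    formatted = join_authors([ieee_author(a) for a in authors])
    return f'[{number}] {formatted}, "{title}," arXiv:{arxiv_id}, {year}.'


print(ieee_arxiv_entry(
    2,
    ["Jacob Devlin", "Ming-Wei Chang", "Kenton Lee", "Kristina Toutanova"],
    "BERT: Pre-training of deep bidirectional transformers for language understanding",
    "1810.04805",
    2018,
))
```

Family names with particles (for example "van der Maaten") would need extra handling that this sketch deliberately leaves out.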
Special cases
- More than 6 authors: IEEE allows listing the first author followed by `et al.`: `A. Vaswani et al., "Attention is all you need," arXiv:1706.03762, 2017.` Use this for papers with many authors to keep reference entries readable.
- If the paper has also been published at a venue: prefer the venue citation format over arXiv. In this workflow we only have arXiv metadata, so always use the arXiv form.
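The author-count special case might look like the following sketch. The helper name `ieee_author_list` and the constant `ET_AL_THRESHOLD` are hypothetical; the sketch assumes the names are already in `FirstInitial. LastName` form and always abbreviates above the threshold, although the rule above only says that IEEE allows it.

```python
ET_AL_THRESHOLD = 6  # abbreviate when a paper has more than 6 authors


def ieee_author_list(formatted_authors, threshold=ET_AL_THRESHOLD):
    """Return the author part of a reference entry, using 'et al.' for long lists."""
    authors = list(formatted_authors)
    if not authors:
        raise ValueError("at least one author is required")
    if len(authors) > threshold:
        return f"{authors[0]} et al."
    if len(authors) == 1:
        return authors[0]
    if len(authors) == 2:
        return f"{authors[0]} and {authors[1]}"
    return ", ".join(authors[:-1]) + f", and {authors[-1]}"


vaswani = [
    "A. Vaswani", "N. Shazeer", "N. Parmar", "J. Uszkoreit",
    "L. Jones", "A. N. Gomez", "Ł. Kaiser", "I. Polosukhin",
]
entry = f'{ieee_author_list(vaswani)}, "Attention is all you need," arXiv:1706.03762, 2017.'
print(entry)
```

Passing a larger `threshold` keeps the full author list, which reproduces the long form shown in the real example earlier.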
Report Structure
Follow this structure verbatim. Note that IEEE reports use numeric citations throughout, so you need to assign a number to each paper in order of first appearance in the Themes section, then use those numbers consistently in per-paper annotations and the reference list.
# Systematic Literature Review: <Topic>
**Date**: <YYYY-MM-DD>
**Papers surveyed**: <N>
**Scope**: <arXiv search query, category, time window>
**Citation format**: IEEE
## Executive Summary
<3-5 sentences summarizing the state of the literature. Cite papers with bracketed numbers as you first introduce them, e.g. "Transformer architectures [1] have become the dominant approach, with extensions focusing on efficiency [2], [3] and long-context handling [4].">
## Methodology
This review surveyed <N> arXiv papers retrieved on <YYYY-MM-DD> using the query `<query>`<, filtered to category <cat>><, published between <start_date> and <end_date>>. Papers were sorted by <relevance | submission date> and the top <N> were included. Metadata extraction was performed by language-model agents, with cross-paper synthesis performed by the lead agent.
**Limitations of this review**: arXiv preprints are not peer-reviewed; coverage is limited to arXiv.
## Themes
<3-6 thematic sections. First appearance of each paper gets a bracketed number; subsequent mentions reuse the same number. The number assignment order is: first paper mentioned in Theme 1 gets [1], next new paper gets [2], etc.>
### Theme 1: <Theme name>
<Paragraphs describing the theme. Cite with bracketed numbers: "The original transformer architecture [1] introduced self-attention, which was later extended in [2] and [3]. Comparative analyses [4] show that...">
### Theme 2: <Theme name>
<...>
## Convergences and Disagreements
**Convergences**: <e.g. "Multiple papers [1], [3], [5] agree that X is necessary.">
**Disagreements**: <e.g. "While [1] argues X, [2] finds the opposite under condition Y.">
## Gaps and Open Questions
<What the collective literature does not yet address, with citations to papers that explicitly mention these gaps.>
## Per-Paper Annotations
<One subsection per paper, ordered by their assigned reference number.>
### [1] Vaswani et al., "Attention is all you need" (2017)
**Research question**: <1 sentence>
**Methodology**: <1-2 sentences>
**Key findings**:
- <bullet>
- <bullet>
- <bullet>
**Limitations**: <1-2 sentences>
### [2] <Next paper>
<...>
## References
<Numbered list in order of first appearance in the text. The number must match the in-text citations above.>
[1] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," arXiv:1706.03762, 2017.
[2] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," arXiv:1810.04805, 2018.
<... more entries ...>
Quality checks before finalizing
Before saving the report, verify:
- Every paper in the surveyed set has a unique reference number.
- Reference numbers are assigned in order of first appearance in the text, not alphabetically.
- Every bracketed number in the text has a matching entry in the References section.
- Every entry in References is cited at least once in the text.
- Author names use `FirstInitial. LastName` format (initials before last name).
- Titles are in double quotes and sentence case.
- arXiv identifiers use the `arXiv:<bare_id>` form, not the full URL.
- Per-paper annotations are ordered by reference number, matching the References section order.
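The numbering checks in this list lend themselves to automation. The sketch below is one possible approach, not part of the skill: `check_ieee_numbering` is a hypothetical helper, and it assumes the caller passes the report text without the References section together with the list of numbers used there.

```python
import re

EN_DASH = "\u2013"
# Matches "[3]" and ranges such as "[1]–[3]".
CITATION = re.compile(r"\[(\d+)\](?:" + EN_DASH + r"\[(\d+)\])?")


def cited_numbers(text):
    """All cited reference numbers in text order, with ranges expanded."""
    found = []
    for match in CITATION.finditer(text):
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        found.extend(range(start, end + 1))
    return found


def check_ieee_numbering(body_text, reference_numbers):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    refs = list(reference_numbers)
    cited = cited_numbers(body_text)
    first_seen = list(dict.fromkeys(cited))  # unique, in order of first appearance

    if len(set(refs)) != len(refs):
        problems.append("duplicate reference numbers")
    missing = sorted(set(cited) - set(refs))
    if missing:
        problems.append(f"cited but not in References: {missing}")
    uncited = sorted(set(refs) - set(cited))
    if uncited:
        problems.append(f"in References but never cited: {uncited}")
    if first_seen != list(range(1, len(first_seen) + 1)):
        problems.append(f"not numbered by first appearance: {first_seen}")
    return problems


print(check_ieee_numbering("Later work [2] builds on [1].", [1, 2]))
```

The author-name, title-case, and identifier checks are not covered here; they depend on the entry text rather than on the numbering.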