16aa51c9b3
* feat(skills): add systematic-literature-review skill for multi-paper SLR workflows Adds a new skill that produces a structured systematic literature review (SLR) across multiple academic papers on a topic. Addresses #1862 with a pure skill approach: no new tools, no architectural changes, no new dependencies. Skill layout: - SKILL.md — 4+1 phase workflow (plan, search, extract, synthesize, present) - scripts/arxiv_search.py — arXiv API client, stdlib only, with a requests->urllib fallback shim modeled after github-deep-research's github_api.py - templates/{apa,ieee,bibtex}.md — citation format templates selected dynamically in Phase 4, mirroring podcast-generation's templates/ pattern Design notes: - Multi-paper synthesis uses the existing `task` tool to dispatch extraction subagents in parallel. SKILL.md's Phase 3 includes a fixed decision table for batch splitting to respect the runtime's MAX_CONCURRENT_SUBAGENTS = 3 cap, and explicitly tells the agent to strip the "Task Succeeded. Result: " prefix before parsing subagent JSON output. - arXiv only, by design. Semantic Scholar and PubMed adapters would push the scope toward a standalone MCP server (see #933) and are intentionally out of scope for this skill. - Coexists with the existing `academic-paper-review` skill: this skill does breadth-first synthesis across many papers, academic-paper-review does single-paper peer review. The two are routed via distinct triggers and can compose (SLR on many + deep review on 1-2 important ones). - Hard upper bound of 50 papers, tied to the Phase 3 concurrency strategy. Larger surveys degrade in synthesis quality and are better split by sub-topic. BibTeX template explicitly uses @misc for arXiv preprints (not @article), which is the most common mistake when generating BibTeX for arXiv papers. 
arxiv_search.py was smoke-tested end-to-end against the live arXiv API with two query shapes (relevance sort, submittedDate sort with category filter); all returned JSON fields parse correctly (id normalization, Atom namespace handling, URL encoding for multi-word queries). * fix(skills): prevent LLM from saving intermediate search results to file Adds an explicit "do not save" instruction at the end of Phase 2. Observed during Test 1 with DeepSeek: the model saved search results to a markdown file before proceeding to Phase 3, wasting 2-3 tool call rounds and increasing the risk of hitting the graph recursion limit. The search JSON should stay in context for Phase 3, not be persisted. * fix(skills): use relevance+start-date instead of submittedDate sorting Test 2 revealed that arXiv's submittedDate sorting returns the most recently submitted papers in the category regardless of query relevance. Searching "diffusion models" with sortBy=submittedDate in cs.CV returned papers on spatial memory, Navier-Stokes, and photon-counting CT — none about diffusion models. The LLM then retried with 4 different queries, wasting tool calls and approaching the recursion limit. Fix: always sort by relevance; when the user wants "recent" papers, combine relevance sorting with --start-date to constrain the time window. Also add an explicit "run the search exactly once" instruction to prevent the retry loop. * fix(skills): wrap multi-word arXiv queries in double quotes for phrase matching Without quotes, `all:diffusion model` is parsed by arXiv's Lucene as `all:diffusion OR model`, pulling in unrelated papers from physics (thermal diffusion) and other fields. Wrapping in double quotes forces phrase matching: `all:"diffusion model"`. Also fixes date filtering: the previous bug caused 2011 papers to appear in results despite --start-date 2024-04-09, because the unquoted query words were OR'd with the date constraint. 
Verified: "diffusion models" --category cs.CV --start-date 2024-04-09 now returns only relevant diffusion model papers published after April 2024. * fix(skills): add query phrasing guide and enforce subagent delegation Two fixes from Test 2 observations with DeepSeek: 1. Query phrasing: add a table showing good vs bad query examples. The script wraps multi-word queries in double quotes for phrase matching, so long queries like "diffusion models in computer vision" return 0 results. Guide the LLM to use 2-3 core keywords + --category instead. 2. Subagent enforcement: DeepSeek was extracting metadata inline via python -c scripts instead of using the task tool. Strengthen Phase 3 to explicitly name the task tool, say "do not extract metadata yourself", and explain why (token budget, isolation). This is more direct than the previous natural-language-only approach while still providing the reasoning behind the constraint. * fix(skills): strengthen search keyword guidance and subagent enforcement Address two issues found during end-to-end testing with DeepSeek: 1. Search retry: LLM passed full topic descriptions as queries (e.g. "diffusion models in computer vision"), which returned 0 results due to exact phrase matching and triggered retries. Added explicit instruction to extract 2-3 core keywords before searching. 2. Subagent bypass: LLM used python -c to extract metadata instead of dispatching via task tool. Added explicit prohibition list (python -c, bash scripts, inline extraction) with ❌ markers for clarity. * fix(skills): address Copilot review feedback on SLR skill - Fix legacy arXiv ID parsing: preserve archive prefix for pre-2007 papers (e.g. 
hep-th/9901001 instead of just 9901001) - Fix phase count: "four phases" -> "five phases" - Add subagent_enabled prerequisite note to SKILL.md Notes section - Remove PR-specific references ("PR 1") from ieee.md and bibtex.md templates, replace with workflow-scoped wording - Fix script header: "stdlib only" -> "no additional dependencies required", fix relative path to github_api.py reference - Remove reference to non-existent docs/enhancement/ path in header * Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
6.0 KiB
APA 7th Edition Citation Template
Use this template when the user requests APA format, or when they do not specify a format. APA 7th is the standard style in the social sciences and is common in CS venues outside IEEE, which is why it serves as the default.
Citation Format Rules
In-text citations
- Single author: `(Vaswani, 2017)` or `Vaswani (2017) showed that...`
- Two authors: `(Vaswani & Shazeer, 2017)`. Use `&` inside parentheses, "and" in running text.
- Three or more authors: `(Vaswani et al., 2017)`. Use `et al.` from the first citation onward (APA 7th changed this from APA 6th).
- Multiple citations: `(Devlin et al., 2018; Vaswani et al., 2017)`. List them in alphabetical order by first author, separated by semicolons.
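The in-text rules above can be sketched in Python as follows. This is a minimal illustration, not part of the skill: the function names `in_text_citation` and `multiple_citations` are hypothetical, and the alphabetical ordering is a plain string sort that does not account for diacritics or case folding.

```python
def in_text_citation(last_names, year, narrative=False):
    """Build an APA 7th in-text citation from author last names and a year.

    Hypothetical helper: `last_names` is a list of family names in author order.
    """
    if not last_names:
        raise ValueError("at least one author is required")
    if len(last_names) == 1:
        who = last_names[0]
    elif len(last_names) == 2:
        # "&" inside parentheses, "and" in running text
        joiner = " and " if narrative else " & "
        who = joiner.join(last_names)
    else:
        # Three or more authors: "et al." from the first citation onward
        who = f"{last_names[0]} et al."
    return f"{who} ({year})" if narrative else f"({who}, {year})"


def multiple_citations(citations):
    """Combine (last_names, year) pairs into one alphabetized parenthetical."""
    # Strip the outer parentheses from each citation, then sort and rejoin.
    parts = sorted(in_text_citation(names, year)[1:-1] for names, year in citations)
    return "(" + "; ".join(parts) + ")"


print(in_text_citation(["Vaswani", "Shazeer"], 2017, narrative=True))
print(multiple_citations([
    (["Vaswani", "Shazeer", "Parmar"], 2017),
    (["Devlin", "Chang", "Lee", "Toutanova"], 2018),
]))
```

Passing `narrative=True` produces the running-text form (`Vaswani and Shazeer (2017)`); the default produces the parenthetical form.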
Reference list entry for arXiv preprints
arXiv papers are preprints, not formally published articles. Cite them as preprints with the arXiv identifier:
Author, A. A., Author, B. B., & Author, C. C. (Year). Title of the paper. arXiv. https://arxiv.org/abs/ARXIV_ID
Real example (from paper metadata `{id: "1706.03762", title: "Attention Is All You Need", authors: ["Ashish Vaswani", "Noam Shazeer", "Niki Parmar", "Jakob Uszkoreit", "Llion Jones", "Aidan N. Gomez", "Łukasz Kaiser", "Illia Polosukhin"], published: "2017-06-12"}`):
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. arXiv. https://arxiv.org/abs/1706.03762
Formatting rules:
- Author names: `LastName, FirstInitial.` (middle initial optional). Join with commas; the last author is preceded by `&`.
- Year: the `published` field's year, in parentheses.
- Title: sentence case (only first word and proper nouns capitalized). Italicize titles in typeset output; in plain markdown, leave plain.
- Source: the literal word `arXiv`, then the full abs URL.
- No DOI unless the paper has also been published in a venue with a DOI. arXiv alone uses the URL.
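As a rough sketch of how these formatting rules map onto the paper metadata, the Python below builds a reference entry from a dict shaped like the example above. The helper names (`format_author`, `sentence_case`, `format_reference`) are hypothetical. It assumes the last whitespace-separated token of a name is the family name, falls back to building the URL from `id` when `abs_url` is absent, and uses a naive sentence-case heuristic that keeps tokens with two or more capitals (acronyms) but cannot recognize other proper nouns. Author lists of 21 or more are not handled here.

```python
def format_author(full_name):
    """Turn "Aidan N. Gomez" into "Gomez, A. N." (last token = family name)."""
    parts = full_name.split()
    if len(parts) == 1:
        return parts[0]
    *given, last = parts
    initials = []
    for name in given:
        # Hyphenated given names keep the hyphen: "Ming-Wei" -> "M.-W."
        pieces = [p for p in name.strip(".").split("-") if p]
        if pieces:
            initials.append("-".join(f"{p[0].upper()}." for p in pieces))
    return f"{last}, {' '.join(initials)}"


def sentence_case(title):
    """Naive sentence case: lowercases everything except acronyms,
    the first word, and the first word after a colon."""
    out = []
    capitalize_next = True
    for word in title.split():
        if sum(ch.isupper() for ch in word) >= 2:
            out.append(word)  # treat as an acronym, e.g. "BERT:"
        elif capitalize_next:
            out.append(word[0].upper() + word[1:].lower())
        else:
            out.append(word.lower())
        capitalize_next = word.endswith(":")
    return " ".join(out)


def format_reference(paper):
    """Build an APA 7th reference entry for an arXiv preprint (up to 20 authors)."""
    authors = [format_author(a) for a in paper["authors"]]
    if len(authors) > 1:
        author_str = ", ".join(authors[:-1]) + ", & " + authors[-1]
    else:
        author_str = authors[0]
    year = paper["published"][:4]  # "2017-06-12" -> "2017"
    # Assumption: prefer abs_url from the metadata, else derive it from the id.
    url = paper.get("abs_url") or f"https://arxiv.org/abs/{paper['id']}"
    return f"{author_str} ({year}). {sentence_case(paper['title'])}. arXiv. {url}"


paper = {
    "id": "1706.03762",
    "title": "Attention Is All You Need",
    "authors": ["Ashish Vaswani", "Noam Shazeer", "Niki Parmar", "Jakob Uszkoreit",
                "Llion Jones", "Aidan N. Gomez", "Łukasz Kaiser", "Illia Polosukhin"],
    "published": "2017-06-12",
}
print(format_reference(paper))
```

Because the sentence-case step is heuristic, titles containing proper nouns (model names written in mixed case, place names) still need a manual check.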
Special cases
- Up to 20 authors: list all of them separated by commas, with `&` before the last.
- 21 or more authors: list the first 19, then `...`, then the final author.
- No DOI and no URL: not possible for arXiv papers; always use the `abs_url` from the paper metadata.
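The author-count cases can be illustrated with a short Python sketch. The function name `join_authors` is hypothetical, and it assumes each name has already been formatted as `LastName, FirstInitial.`.

```python
def join_authors(formatted):
    """Join pre-formatted author names following the APA 7th length rules."""
    n = len(formatted)
    if n == 0:
        raise ValueError("at least one author is required")
    if n == 1:
        return formatted[0]
    if n <= 20:
        # Up to 20 authors: list all, with "&" before the last
        return ", ".join(formatted[:-1]) + ", & " + formatted[-1]
    # 21 or more: first 19, an ellipsis, then the final author (no "&")
    return ", ".join(formatted[:19]) + ", ... " + formatted[-1]


names = [f"Author{i}, A." for i in range(1, 22)]  # 21 placeholder authors
print(join_authors(names))
```

With 21 placeholder names, authors 20 is omitted and the entry ends with the 21st author after the ellipsis.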
Report Structure
Follow this structure verbatim when writing the SLR report body. Fill in content from your Phase 3 extraction and Phase 4 synthesis.
# Systematic Literature Review: <Topic>
**Date**: <YYYY-MM-DD>
**Papers surveyed**: <N>
**Scope**: <arXiv search query, category, time window>
**Citation format**: APA 7th edition
## Executive Summary
<3-5 sentences summarizing the state of the literature on this topic. What do the surveyed papers collectively tell us? What is the shape of the field? Avoid listing papers — synthesize.>
## Methodology
This review surveyed <N> arXiv papers retrieved on <YYYY-MM-DD> using the query `<query>`<, filtered to category <cat>><, published between <start_date> and <end_date>>. Papers were sorted by <relevance | submission date> and the top <N> were included. Metadata extraction (research question, methodology, key findings, limitations) was performed by language-model agents, with cross-paper synthesis performed by the lead agent.
**Limitations of this review**: arXiv preprints are not peer-reviewed; some included papers may not reflect their final published form. Coverage is limited to arXiv — papers published directly in venues without arXiv preprints are not represented.
## Themes
<3-6 thematic sections. Each theme is a recurring research direction, problem framing, or methodological approach across the surveyed papers.>
### Theme 1: <Theme name>
<2-4 paragraphs describing this theme. Cite papers inline as you discuss them, e.g. "Vaswani et al. (2017) introduced X, while subsequent work (Devlin et al., 2018; Liu et al., 2019) extended it to Y." Do not just list papers — describe the intellectual thread that connects them.>
### Theme 2: <Theme name>
<...>
## Convergences and Disagreements
**Convergences**: <findings that multiple papers agree on — e.g. "Most surveyed papers agree that X is necessary, citing evidence from Y and Z.">
**Disagreements**: <where papers reach different conclusions — e.g. "Vaswani et al. (2017) argue that X, while Dai et al. (2019) find the opposite under condition Y.">
## Gaps and Open Questions
<What the collective literature does not yet address. Pull from the "limitations" field of your Phase 3 extraction and identify patterns — if 5 papers all mention the same missing piece, that is a gap worth flagging.>
## Per-Paper Annotations
<One subsection per paper, ordered by year then first author. Each subsection is a mini-summary of that paper's contribution.>
### Vaswani et al. (2017)
**Research question**: <1 sentence from Phase 3 metadata>
**Methodology**: <1-2 sentences>
**Key findings**:
- <bullet>
- <bullet>
- <bullet>
**Limitations**: <1-2 sentences>
### <Next paper>
<...>
## References
<Alphabetical list by first author's last name, APA 7th format as described above.>
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv. https://arxiv.org/abs/1810.04805
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. arXiv. https://arxiv.org/abs/1706.03762
<... more entries, one per paper ...>
Quality checks before finalizing
Before saving the report, verify:
- Every paper in the surveyed set appears both in "Per-Paper Annotations" and in "References".
- Every in-text citation matches a reference entry (no dangling citations).
- Authors are formatted `LastName, FirstInitial.`, not `FirstName LastName`.
- Years are in parentheses in in-text citations, and follow the author list in parentheses in reference entries.
- Titles are in sentence case in references (only first word + proper nouns capitalized).
- arXiv URLs use the `abs_url` form (`https://arxiv.org/abs/...`), not `pdf_url`.
- References are alphabetized by first author's last name.
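Some of these checks are mechanical enough to script. The sketch below is one possible approach, not part of the skill: `check_references` is a hypothetical helper that takes the reference entries as plain strings plus the first-author surnames cited in the text, and it only covers the URL form, the parenthesized year, alphabetical order, and dangling citations. Sentence case and author-name formatting still need a manual read.

```python
import re

ABS_PREFIX = "https://arxiv.org/abs/"


def check_references(references, cited_first_authors=()):
    """Return a list of human-readable problems found in a reference list."""
    problems = []
    for ref in references:
        label = ref[:30]
        if ABS_PREFIX not in ref:
            problems.append(f"no abs URL: {label}")
        if not re.search(r"\(\d{4}\)\.", ref):
            problems.append(f"no parenthesized year: {label}")
    # First author's surname is the text before the first comma.
    surnames = [ref.split(",", 1)[0].strip().casefold() for ref in references]
    if surnames != sorted(surnames):
        problems.append("references are not alphabetized by first author")
    for name in cited_first_authors:
        if name.casefold() not in surnames:
            problems.append(f"dangling citation: {name}")
    return problems


refs = [
    "Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep "
    "bidirectional transformers for language understanding. arXiv. https://arxiv.org/abs/1810.04805",
    "Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & "
    "Polosukhin, I. (2017). Attention is all you need. arXiv. https://arxiv.org/abs/1706.03762",
]
print(check_references(refs, cited_first_authors=["Vaswani", "Devlin", "Liu"]))
```

An empty list means none of the scripted checks failed; it does not replace the remaining manual checks.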