mirror of
https://github.com/bytedance/deer-flow.git
synced 2026-05-24 17:06:00 +00:00
16aa51c9b3
* feat(skills): add systematic-literature-review skill for multi-paper SLR workflows Adds a new skill that produces a structured systematic literature review (SLR) across multiple academic papers on a topic. Addresses #1862 with a pure skill approach: no new tools, no architectural changes, no new dependencies. Skill layout: - SKILL.md — 4+1 phase workflow (plan, search, extract, synthesize, present) - scripts/arxiv_search.py — arXiv API client, stdlib only, with a requests->urllib fallback shim modeled after github-deep-research's github_api.py - templates/{apa,ieee,bibtex}.md — citation format templates selected dynamically in Phase 4, mirroring podcast-generation's templates/ pattern Design notes: - Multi-paper synthesis uses the existing `task` tool to dispatch extraction subagents in parallel. SKILL.md's Phase 3 includes a fixed decision table for batch splitting to respect the runtime's MAX_CONCURRENT_SUBAGENTS = 3 cap, and explicitly tells the agent to strip the "Task Succeeded. Result: " prefix before parsing subagent JSON output. - arXiv only, by design. Semantic Scholar and PubMed adapters would push the scope toward a standalone MCP server (see #933) and are intentionally out of scope for this skill. - Coexists with the existing `academic-paper-review` skill: this skill does breadth-first synthesis across many papers, academic-paper-review does single-paper peer review. The two are routed via distinct triggers and can compose (SLR on many + deep review on 1-2 important ones). - Hard upper bound of 50 papers, tied to the Phase 3 concurrency strategy. Larger surveys degrade in synthesis quality and are better split by sub-topic. BibTeX template explicitly uses @misc for arXiv preprints (not @article), which is the most common mistake when generating BibTeX for arXiv papers. arxiv_search.py was smoke-tested end-to-end against the live arXiv API with two query shapes (relevance sort, submittedDate sort with category filter); all returned JSON fields parse correctly (id normalization, Atom namespace handling, URL encoding for multi-word queries). * fix(skills): prevent LLM from saving intermediate search results to file Adds an explicit "do not save" instruction at the end of Phase 2. Observed during Test 1 with DeepSeek: the model saved search results to a markdown file before proceeding to Phase 3, wasting 2-3 tool call rounds and increasing the risk of hitting the graph recursion limit. The search JSON should stay in context for Phase 3, not be persisted. * fix(skills): use relevance+start-date instead of submittedDate sorting Test 2 revealed that arXiv's submittedDate sorting returns the most recently submitted papers in the category regardless of query relevance. Searching "diffusion models" with sortBy=submittedDate in cs.CV returned papers on spatial memory, Navier-Stokes, and photon-counting CT — none about diffusion models. The LLM then retried with 4 different queries, wasting tool calls and approaching the recursion limit. Fix: always sort by relevance; when the user wants "recent" papers, combine relevance sorting with --start-date to constrain the time window. Also add an explicit "run the search exactly once" instruction to prevent the retry loop. * fix(skills): wrap multi-word arXiv queries in double quotes for phrase matching Without quotes, `all:diffusion model` is parsed by arXiv's Lucene as `all:diffusion OR model`, pulling in unrelated papers from physics (thermal diffusion) and other fields. Wrapping in double quotes forces phrase matching: `all:"diffusion model"`. Also fixes date filtering: the previous bug caused 2011 papers to appear in results despite --start-date 2024-04-09, because the unquoted query words were OR'd with the date constraint. Verified: "diffusion models" --category cs.CV --start-date 2024-04-09 now returns only relevant diffusion model papers published after April 2024. * fix(skills): add query phrasing guide and enforce subagent delegation Two fixes from Test 2 observations with DeepSeek: 1. Query phrasing: add a table showing good vs bad query examples. The script wraps multi-word queries in double quotes for phrase matching, so long queries like "diffusion models in computer vision" return 0 results. Guide the LLM to use 2-3 core keywords + --category instead. 2. Subagent enforcement: DeepSeek was extracting metadata inline via python -c scripts instead of using the task tool. Strengthen Phase 3 to explicitly name the task tool, say "do not extract metadata yourself", and explain why (token budget, isolation). This is more direct than the previous natural-language-only approach while still providing the reasoning behind the constraint. * fix(skills): strengthen search keyword guidance and subagent enforcement Address two issues found during end-to-end testing with DeepSeek: 1. Search retry: LLM passed full topic descriptions as queries (e.g. "diffusion models in computer vision"), which returned 0 results due to exact phrase matching and triggered retries. Added explicit instruction to extract 2-3 core keywords before searching. 2. Subagent bypass: LLM used python -c to extract metadata instead of dispatching via task tool. Added explicit prohibition list (python -c, bash scripts, inline extraction) with ❌ markers for clarity. * fix(skills): address Copilot review feedback on SLR skill - Fix legacy arXiv ID parsing: preserve archive prefix for pre-2007 papers (e.g. hep-th/9901001 instead of just 9901001) - Fix phase count: "four phases" -> "five phases" - Add subagent_enabled prerequisite note to SKILL.md Notes section - Remove PR-specific references ("PR 1") from ieee.md and bibtex.md templates, replace with workflow-scoped wording - Fix script header: "stdlib only" -> "no additional dependencies required", fix relative path to github_api.py reference - Remove reference to non-existent docs/enhancement/ path in header * Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
126 lines
6.0 KiB
Markdown
126 lines
6.0 KiB
Markdown
# APA 7th Edition Citation Template
|
|
|
|
Use this template when the user requests APA format, or when they do not specify a format. APA 7th is the default for social sciences and most CS journals outside of IEEE venues.
|
|
|
|
## Citation Format Rules
|
|
|
|
### In-text citations
|
|
|
|
- **Single author**: `(Vaswani, 2017)` or `Vaswani (2017) showed that...`
|
|
- **Two authors**: `(Vaswani & Shazeer, 2017)` — use `&` inside parentheses, "and" in running text.
|
|
- **Three or more authors**: `(Vaswani et al., 2017)` — use `et al.` from the first citation onward (APA 7th changed this from APA 6th).
|
|
- **Multiple citations**: `(Vaswani et al., 2017; Devlin et al., 2018)` — alphabetical order, separated by semicolons.
|
|
|
|
### Reference list entry for arXiv preprints
|
|
|
|
arXiv papers are preprints, not formally published articles. Cite them as preprints with the arXiv identifier:
|
|
|
|
```
|
|
Author, A. A., Author, B. B., & Author, C. C. (Year). Title of the paper. arXiv. https://arxiv.org/abs/ARXIV_ID
|
|
```
|
|
|
|
**Real example** (from paper metadata `{id: "1706.03762", title: "Attention Is All You Need", authors: ["Ashish Vaswani", "Noam Shazeer", "Niki Parmar", "Jakob Uszkoreit", "Llion Jones", "Aidan N. Gomez", "Łukasz Kaiser", "Illia Polosukhin"], published: "2017-06-12"}`):
|
|
|
|
```
|
|
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. arXiv. https://arxiv.org/abs/1706.03762
|
|
```
|
|
|
|
Formatting rules:
|
|
|
|
- **Author names**: `LastName, FirstInitial.` (middle initial optional). Join with commas; last author gets an `&`.
|
|
- **Year**: the `published` field's year, in parentheses.
|
|
- **Title**: sentence case (only first word and proper nouns capitalized). Italicize titles in typeset output; in plain markdown, leave plain.
|
|
- **Source**: the literal word `arXiv`, then the full abs URL.
|
|
- **No DOI** unless the paper has also been published in a venue with a DOI. arXiv alone uses the URL.
|
|
|
|
### Special cases
|
|
|
|
- **Up to 20 authors**: list all of them separated by commas, with `&` before the last.
|
|
- **21 or more authors**: list the first 19, then `...`, then the final author.
|
|
- **No DOI and no URL**: not possible for arXiv papers; always use the `abs_url` from the paper metadata.
|
|
|
|
## Report Structure
|
|
|
|
Follow this structure verbatim when writing the SLR report body. Fill in content from your Phase 3 extraction and Phase 4 synthesis.
|
|
|
|
```markdown
|
|
# Systematic Literature Review: <Topic>
|
|
|
|
**Date**: <YYYY-MM-DD>
|
|
**Papers surveyed**: <N>
|
|
**Scope**: <arXiv search query, category, time window>
|
|
**Citation format**: APA 7th edition
|
|
|
|
## Executive Summary
|
|
|
|
<3-5 sentences summarizing the state of the literature on this topic. What do the surveyed papers collectively tell us? What is the shape of the field? Avoid listing papers — synthesize.>
|
|
|
|
## Methodology
|
|
|
|
This review surveyed <N> arXiv papers retrieved on <YYYY-MM-DD> using the query `<query>`<, filtered to category <cat>><, published between <start_date> and <end_date>>. Papers were sorted by <relevance | submission date> and the top <N> were included. Metadata extraction (research question, methodology, key findings, limitations) was performed by language-model agents, with cross-paper synthesis performed by the lead agent.
|
|
|
|
**Limitations of this review**: arXiv preprints are not peer-reviewed; some included papers may not reflect their final published form. Coverage is limited to arXiv — papers published directly in venues without arXiv preprints are not represented.
|
|
|
|
## Themes
|
|
|
|
<3-6 thematic sections. Each theme is a recurring research direction, problem framing, or methodological approach across the surveyed papers.>
|
|
|
|
### Theme 1: <Theme name>
|
|
|
|
<2-4 paragraphs describing this theme. Cite papers inline as you discuss them, e.g. "Vaswani et al. (2017) introduced X, while subsequent work (Devlin et al., 2018; Liu et al., 2019) extended it to Y." Do not just list papers — describe the intellectual thread that connects them.>
|
|
|
|
### Theme 2: <Theme name>
|
|
|
|
<...>
|
|
|
|
## Convergences and Disagreements
|
|
|
|
**Convergences**: <findings that multiple papers agree on — e.g. "Most surveyed papers agree that X is necessary, citing evidence from Y and Z.">
|
|
|
|
**Disagreements**: <where papers reach different conclusions — e.g. "Vaswani et al. (2017) argue that X, while Dai et al. (2019) find the opposite under condition Y.">
|
|
|
|
## Gaps and Open Questions
|
|
|
|
<What the collective literature does not yet address. Pull from the "limitations" field of your Phase 3 extraction and identify patterns — if 5 papers all mention the same missing piece, that is a gap worth flagging.>
|
|
|
|
## Per-Paper Annotations
|
|
|
|
<One subsection per paper, ordered by year then first author. Each subsection is a mini-summary of that paper's contribution.>
|
|
|
|
### Vaswani et al. (2017)
|
|
|
|
**Research question**: <1 sentence from Phase 3 metadata>
|
|
**Methodology**: <1-2 sentences>
|
|
**Key findings**:
|
|
- <bullet>
|
|
- <bullet>
|
|
- <bullet>
|
|
**Limitations**: <1-2 sentences>
|
|
|
|
### <Next paper>
|
|
|
|
<...>
|
|
|
|
## References
|
|
|
|
<Alphabetical list by first author's last name, APA 7th format as described above.>
|
|
|
|
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv. https://arxiv.org/abs/1810.04805
|
|
|
|
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. arXiv. https://arxiv.org/abs/1706.03762
|
|
|
|
<... more entries, one per paper ...>
|
|
```
|
|
|
|
## Quality checks before finalizing
|
|
|
|
Before saving the report, verify:
|
|
|
|
- [ ] Every paper in the surveyed set appears **both** in "Per-Paper Annotations" **and** in "References".
|
|
- [ ] Every in-text citation matches a reference entry (no dangling citations).
|
|
- [ ] Authors are formatted `LastName, FirstInitial.` — not `FirstName LastName`.
|
|
- [ ] Years are in parentheses inline, and at the start of reference entries.
|
|
- [ ] Titles are in sentence case in references (only first word + proper nouns capitalized).
|
|
- [ ] arXiv URLs use the `abs_url` form (`https://arxiv.org/abs/...`), not `pdf_url`.
|
|
- [ ] References are alphabetized by first author's last name.
|