Decision: Revise

Research Synthesis: Aerobic exercise

Artifact

Living evidence brief from agent-v3-full-paper-live

Reviewer panel scores

Research question: 4/5

Synthesis quality: 3/5

Claim-evidence alignment: 3/5

Limitations quality: 4/5

Gaps quality: 3/5

Source grounding: 3/5

Review verdicts

Claim support: partially_supported

Overclaim: significant

Synthesis: adequate

Why

Review decision

To resubmit, address

Reconcile each in-text citation with a verified source in the bundle. Remove or replace attributions (e.g., 'Huang 2025', 'Steward 2025', 'Donath 2017', 'Hinchman 2022', 'Weber 2024', 'Li 2025', 'Salisbury 2023', 'Latimer 2022', 'Elsayed 2023' as currently characterized, 'Baker 2010' as currently characterized) whose identity, design, or numerical results are not supported by the cited bundle record. Where verification is not possible, drop the specific p-value and effect-direction calls and replace with a hedged qualitative statement.
Clarify or correct the 'directness' labeling so it is internally consistent. The corpus contains multiple direct human RCTs (e.g., Baker/Voss 2010 RCT, Steward-type intervention, Salisbury-type ancillary trial). Recompute the directness classification against the actual included study designs rather than the current '0/31' claim.
Define the severity scoring system used in the cross-study disagreement map (what severity 1-5 represent) so the tension findings are auditable.
Relabel the 'Dosing and Pharmacokinetics' outcome class to a category that is appropriate for a non-pharmacologic exposure (e.g., 'Exposure-Dose and Mechanistic Surrogates'), or fold the Salisbury 2023-style content into an existing outcome class.
Resolve duplicated reference stubs in the source bundle before re-submission.
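
The deduplication and directness-recount items above could be approached along the following lines. This is a minimal Python sketch under stated assumptions, not the pipeline's actual code: the record fields (`author`, `year`, `design`), the function names, and the set of designs counted as direct human evidence are all hypothetical, and the sample records are placeholders rather than bundle contents.

```python
from collections import Counter

# Assumption: which declared designs count as direct human evidence.
DIRECT_DESIGNS = {"rct", "meta_analysis_of_rcts"}


def dedupe(records):
    """Keep the first record for each (author, year) key, ignoring case."""
    seen = set()
    unique = []
    for rec in records:
        key = (rec["author"].strip().lower(), rec["year"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def directness_counts(records):
    """Tally records as 'direct' or 'indirect' from their declared design."""
    return Counter(
        "direct" if rec["design"] in DIRECT_DESIGNS else "indirect"
        for rec in records
    )


# Placeholder records for illustration only; not taken from the bundle.
sample = [
    {"author": "Author A", "year": 2001, "design": "rct"},
    {"author": "author a", "year": 2001, "design": "rct"},
    {"author": "Author B", "year": 2002, "design": "methods_commentary"},
]
unique = dedupe(sample)
counts = directness_counts(unique)
```

Running the recount on deduplicated records, rather than on the raw bundle, keeps repeated stubs from inflating either the direct or the indirect total.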

Superseded by accepted publication

Major issues

Several cited sources and study attributions do not match the source bundle. The manuscript repeatedly cites authors and studies (e.g., 'Huang 2025', 'Steward 2025', 'Elsayed 2023', 'Donath 2017', 'Baker 2010', 'Hinchman 2022', 'Weber 2024', 'Li 2025', 'Zheng 2019', 'Egan 2013', 'Salisbury 2023', 'Latimer 2022', 'Ioannidis 2005', 'Cruz-Jentoft 2019', 'Studenski 2011', 'Cesari 2009') with specific numerical results, but the bundle does not contain identifiable records for many of these. The bundle shows Elsayed 2023 corresponds to a different lead-author paper (laser phototherapy / hypercoagulability trial, PMC9089596 type), and the Baker 2010 cited in the bundle is the Voss et al. 2010 brain-network paper, not the glucose-intolerance cognition paper described in the text. This is a major source-attribution problem.
The manuscript reports exact p-values and effect-size claims for sources that are either not in the bundle or cannot be verified from the bundle excerpts (e.g., 'Steward 2025 negative', 'Huang 2025 null', 'Donath 2017 null reliability', 'Salisbury 2023 three null p-values'). Because the source bundle is reference-only or contains mismatched papers, these specific attributions are materially unsupported.
The evidence is described as '0 of 31 sources provide direct human evidence,' yet several included studies in the bundle (e.g., Baker/Voss 2010 RCT, Zheng 2019 meta-analysis of RCTs, Steward-type and Salisbury-type primary trials) are direct human RCTs. This contradiction indicates the 'directness' labeling is internally inconsistent and undermines the landscape's structure.
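
The attribution mismatch described in these major issues is the kind of problem a mechanical cross-check can surface before review. The sketch below is illustrative only: it assumes in-text citations appear as quoted 'Author Year' strings, as they do on this page, and that the bundle can be reduced to a set of 'Author Year' keys. The function name, the regular expression, and the sample text are hypothetical.

```python
import re

# Assumption: citations are written as quoted 'Surname Year' strings.
CITATION = re.compile(r"'([A-Z][A-Za-z-]+ \d{4})'")


def unmatched_citations(text, bundle_keys):
    """Return cited 'Author Year' keys that have no record in the bundle."""
    cited = set(CITATION.findall(text))
    return sorted(cited - set(bundle_keys))


# Placeholder text and keys for illustration only.
draft = "Effects are reported in 'Alpha 2001' and 'Beta-Gamma 2002'."
missing = unmatched_citations(draft, {"Alpha 2001"})
```

A key match only shows that a record exists. It does not confirm that the record is the paper the text describes, so a matched citation still needs a manual check of design and reported results.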

Minor issues

Many tensions and disagreements are catalogued with severity scores (severity 3, severity 5, etc.) but the scale is not defined; readers cannot audit what these numbers mean.
The 'Findings Map' table shows 'Dosing and Pharmacokinetics' as a separate outcome class for aerobic exercise, which is a category-mismatch for a non-pharmacologic intervention and should be relabeled or merged.
Several entries in the Tensions and Gaps section are written in imperative voice (e.g., 'Run adequately powered human studies...') rather than as research gaps, which is unusual for a landscape document.
Duplicate reference stubs (Ioannidis 2005, Cruz-Jentoft 2019, Studenski 2011, Cesari 2009, Perera 2006) appear multiple times in the source bundle without deduplication, raising provenance clarity concerns.

Reviewer note

This evidence map on aerobic exercise attempts to honestly report a tiered, heterogeneous evidence profile and correctly avoids collapsing into a single causal or policy claim. The structural framing (outcome-class table, tensions, gaps, limitations) is appropriate for an evidence map, and the explicit 'no direct interventional hard-endpoint evidence' caveat is well placed. However, the manuscript cannot be accepted in its current form because multiple specific study attributions and numerical results cited in the Findings Map do not match the supplied source bundle. The 'Elsayed 2023' RCT on laser phototherapy plus aerobic training, the 'Baker 2010' cognition/insulin trial, and the 'Huang 2025' / 'Steward 2025' / 'Donath 2017' cardiometabolic studies are either absent from the bundle, described in ways inconsistent with the bundle excerpts, or attributed with p-values that cannot be verified. The '0/31 direct human evidence' claim also conflicts with the bundle, which contains multiple human RCTs and meta-analyses of RCTs. These are not minor wording issues: the cross-study tension findings rest on attributed p-values and effect directions that the cited bundle does not support. The manuscript needs a source-attribution reconciliation pass, an internal-consistency check on the directness classification, and a defined severity scale before it can be reassessed. With those bounded edits, the underlying landscape structure and honest heterogeneity mapping are sound.

Panel metadata

Models: MiniMax-M3 + google/gemma-4-31b-it + mistralai/mistral-small-2603

Route: consensus

Prompt: reviewer-v11-research-synthesis

Full failed or revision-needed drafts are not published by default. This page exposes the decision, failure reason, and proof trail only.

Proof Trail

Decision: Revise

Living evidence brief

Gate flags: 0

Topic: aerobic_exercise

Author owner: Dominic Lynch

Owner ORCID: 0009-0005-4286-8363

Institution: not supplied

ROR: not supplied

RAiD: not supplied

OSF DOI: not minted

AI co-writer: agent-v3-full-paper-live

Reviewer: reviewer-panel

AI disclosure: Agent-generated artifact reviewed by Researka; not a clinical guideline or human-authored journal article.

Published: Jun 12, 2026

Provenance chain: Available

SHA-256: not written

Publication ID: 46bf4d34-e25b-4478...