Red-Teaming Your Own Draft: LLM Prompts That Find Weak Claims
We all have blind spots. When I was drafting a recent manuscript, a set of "hostile-reviewer" prompts caught four overreaching claims that I would otherwise have shipped straight to the journal. Those prompts now live in my Claude Project for every single paper I write.
Red-teaming a manuscript means deliberately asking an AI to look for failure modes — not to validate your writing, but to attack it. The goal is to surface the weak points before Reviewer 2 does. This post is a spoke in a larger workflow: the full 5-step pre-submission audit is covered in Self-Peer-Review with AI. Here I'm going deeper on the specific hostile prompts that do the most damage to soft claims.
Why Most Authors Self-Review Ineffectively
The problem with self-review is familiarity. You know what you meant to say, so you read what you meant — not what's actually there. Reviewers don't have that context. They read exactly what you wrote, with no charitable interpretations.
LLMs, paradoxically, are good at simulating this outsider perspective when prompted correctly. They don't know your study. They can't fill in the gaps from prior knowledge of your research context. When you tell them to look for weaknesses, they apply the same critical lens a reviewer would — without the fatigue or the social politeness that makes human colleagues soften their feedback.
The key is in how you frame the prompt. "Review my paper" gives you editing assistance. "Attack my paper as a hostile reviewer" gives you adversarial feedback.
Here are the five prompts that have proven most useful for me across multiple manuscript cycles.
Prompt 1: The Claim-Evidence Gap
This prompt finds the most common reviewer complaint: conclusions that go beyond the data.
"Extract every causal claim made in the Discussion. For each claim, find the specific p-value or effect size in the Results section that supports it. If a claim is not directly supported by a quantitative result, flag it as 'Speculative' and quote the exact sentence."
What you get back is a two-column mapping: claim on the left, evidence on the right. Everything flagged "Speculative" is a liability. Either add the evidence, soften the claim, or move it to the Limitations section.
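One way to work through that mapping is to copy it into a small structure and filter it. This is a minimal sketch, assuming you transcribe the rows by hand; the `ClaimCheck` class, its field names, and the sample rows are hypothetical placeholders, not output from a real manuscript.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClaimCheck:
    """One row of the claim-evidence mapping (hypothetical structure)."""
    claim: str                # exact sentence quoted from the Discussion
    evidence: Optional[str]   # p-value or effect size from Results, if any


def speculative(rows: list[ClaimCheck]) -> list[ClaimCheck]:
    """Return the rows that have no quantitative support attached."""
    return [r for r in rows if not (r.evidence and r.evidence.strip())]


# Placeholder rows for illustration only.
rows = [
    ClaimCheck("Claim A (quoted sentence)", "effect size reported in Results"),
    ClaimCheck("Claim B (quoted sentence)", None),
    ClaimCheck("Claim C (quoted sentence)", "   "),
]

for row in speculative(rows):
    print(f"Speculative: {row.claim}")
```

Each row that prints is one you still need to support, soften, or move to Limitations.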
In practice, this prompt catches the "we found X therefore Y probably happens" pattern that feels logical when you're writing but looks like a leap to an external reader.
Prompt 2: The Overreach Audit
Generalizability is the second most common reviewer target after statistics.
"Acting as a cynical peer reviewer, identify three sentences in this manuscript where the authors generalize their findings beyond the study population or the specific parameters measured. Suggest a more conservative phrasing for each."
This works best on the Introduction and Discussion. Common outputs:
- "The findings suggest all patients with X should receive Y" → flagged if the study is single-center
- "Our results demonstrate that Z is effective" → flagged if the sample is small (for example, n < 50)
- "This approach can be applied broadly to settings including..." → flagged if validation is limited
For each suggestion, decide whether to adopt the conservative phrasing or add explicit scope language ("in the context of our study population, which consists of...").
Prompt 3: The Missing Caveat Scan
Reviewers read the Limitations section looking for what you didn't say. This prompt checks whether you said it.
"Scan the Discussion for findings presented without limitations. For every significant result, identify one potential source of bias (selection, measurement, or confounding) that isn't addressed in the 'Limitations' section. Rate the severity of each omission: HIGH (likely to cause a revision request), MEDIUM, or LOW."
The output creates a second pass through your Limitations section. HIGH-severity omissions should be addressed explicitly. MEDIUM omissions can often be dispatched in a single sentence. LOW omissions are acknowledgment gaps you can usually leave as-is given word count constraints.
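If the model returns many omissions, grouping them by severity makes the second pass easier. The sketch below assumes you have already copied each omission out as a (severity, description) pair; the `triage` function and the sample entries are hypothetical.

```python
from collections import defaultdict

# Severity labels as requested in the prompt, most urgent first.
SEVERITY_ORDER = ["HIGH", "MEDIUM", "LOW"]


def triage(omissions):
    """Group (severity, description) pairs by severity, HIGH first."""
    buckets = defaultdict(list)
    for severity, description in omissions:
        key = severity.strip().upper()
        if key not in SEVERITY_ORDER:
            raise ValueError(f"Unknown severity: {severity!r}")
        buckets[key].append(description)
    # Dicts keep insertion order, so HIGH items come out first.
    return {level: buckets[level] for level in SEVERITY_ORDER if level in buckets}


# Placeholder entries for illustration only.
omissions = [
    ("LOW", "Omission 1"),
    ("high", "Omission 2"),
    ("MEDIUM", "Omission 3"),
    ("HIGH", "Omission 4"),
]

for level, items in triage(omissions).items():
    print(level, items)
```

Raising on an unknown label is a deliberate choice here: it surfaces cases where the model drifted from the three labels you asked for.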
This prompt has saved me from at least three "insufficient discussion of limitations" reviewer comments.
Prompt 4: The Mechanism Handwave
This failure mode is less common, but it turns up often in medical and biological research: explaining a result by gesturing at a mechanism you haven't actually studied.
"Identify any section where a 'biological mechanism' or 'plausible explanation' is offered. Does the manuscript provide evidence for this mechanism, or is it citing other papers to fill a gap? Flag these as 'Mechanistic Handwaving' and quote the relevant sentence."
The classic form is: "This may be due to the activation of pathway X, consistent with prior work showing Y (citation)." If your study didn't measure pathway X, this is speculative. Reviewers at journals like JCI or NEJM will immediately flag it. Either test the mechanism, cut the sentence, or explicitly label it as a hypothesis for future work.
Prompt 5: The Unfalsifiable Framing
This is the hardest to catch through self-review because unfalsifiable claims feel profound when you write them.
"Are there any claims in this paper that are framed such that no result from this study could have disproven them? Identify unfalsifiable statements and suggest how to reframe them into testable hypotheses."
Examples that come back from this prompt:
- "AI will transform how we conduct research" — no finding from a single study could disprove this
- "Our framework provides a foundation for future work" — vacuously true of any paper
- "These findings highlight the importance of X" — importance is not measured anywhere
The fix is usually to replace the unfalsifiable statement with a testable prediction: "Based on our findings, we predict that [specific intervention] will [specific outcome] in [specific population]. This remains to be tested in prospective trials."
Running All Five in Sequence
Order matters. Run Prompt 1 (Claim-Evidence Gap) before Prompt 2 (Overreach): if you're going to cut a claim entirely, there's no point auditing whether it's overgeneralized.
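To keep the sequence fixed from paper to paper, you could assemble the session in code before pasting anything into a chat. This is a rough sketch, not a definitive setup: `build_session` and `PROMPT_ORDER` are hypothetical names, no LLM API is called, and keeping Prompts 3 to 5 in numerical order is my assumption rather than a rule.

```python
# Prompt 1 comes before Prompt 2, as described above. Keeping
# Prompts 3-5 in numerical order is an assumption of this sketch.
PROMPT_ORDER = [
    "Claim-Evidence Gap",
    "Overreach Audit",
    "Missing Caveat Scan",
    "Mechanism Handwave",
    "Unfalsifiable Framing",
]


def build_session(prompts: dict[str, str], manuscript: str) -> list[tuple[str, str]]:
    """Pair each prompt with the manuscript text, in the fixed order."""
    missing = [name for name in PROMPT_ORDER if name not in prompts]
    if missing:
        raise KeyError(f"Missing prompt text for: {missing}")
    return [
        (name, f"{prompts[name]}\n\n---\n\n{manuscript}")
        for name in PROMPT_ORDER
    ]


# Placeholder text; paste the real prompt wording and your draft here.
prompts = {name: f"<text of the {name} prompt>" for name in PROMPT_ORDER}
session = build_session(prompts, "<manuscript text>")

for name, _message in session:
    print(name)
```

In practice you would revise the draft after Prompt 1 and pass the updated text to the later prompts, so the manuscript argument may change between steps.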
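After running all five, you'll have a list of flagged items. Only Prompt 3 assigns severity ratings, so apply the same HIGH/MEDIUM/LOW scale to the other flags yourself, then triage: fix the HIGH-severity issues, decide case by case on MEDIUM items, and skip LOW if you're on a deadline.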
This entire red-teaming session takes 45–90 minutes for a typical 3,000-word manuscript. That's a small investment compared to a 3-month revision cycle or a desk rejection.
The Full Prompt Library
These five prompts are drawn from the Prompt Pack: Paper Structuring — 25 tested prompts across the full manuscript workflow, from structuring your Introduction to responding to reviewers. If you're moving a paper from draft to submission and want the prompts already organized by section, the pack is at researchcraft.gumroad.com/l/xrjeei.
Red-teaming your draft is uncomfortable, but it's much better to hear these critiques from Claude than from an editor six months later. Run the prompts, fix the issues, and submit with confidence.