Research Methodology · June 29, 2026

STROBE/CONSORT Compliance via LLM Audit: Catching Reporting Gaps Before Reviewers Do

STROBE and CONSORT compliance checks are mechanical work, and mechanical work is where LLMs earn their keep. During an R2 revision for JIAPS, I caught a missing CONSORT item 3b (important changes to methods after trial commencement). Claude flagged it in about 90 seconds once the prompt was structured correctly. On a manual read, it would have slipped through for the fourth time.

The key word is "structure." An unscoped LLM audit produces confident summaries that miss half the checklist. The template below fixes that.

Why LLM Audits Work for Checklists

Reporting checklists are pattern-matching tasks: does item X appear somewhere in section Y, stated with sufficient specificity? That's a retrieval-and-classification problem, and LLMs handle it well when you scope it correctly.

What they do poorly: inferring whether a vague statement actually meets the standard. "Participants were randomized" passes pattern-matching as CONSORT item 8a (sequence generation). It does not pass a genuine audit that asks how the sequence was generated. The prompt template below separates these two questions — detecting presence versus evaluating specificity.
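The presence-versus-specificity split can be made concrete with a small record per checklist item. This is a minimal sketch in Python, not a prescribed format: the `ItemAudit` class, its field names, and the example rating are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ItemAudit:
    """One checklist item, with the two questions kept apart (hypothetical structure)."""
    item: str                          # checklist item label, e.g. "8a"
    quoted_text: Optional[str]         # sentence found in the manuscript; None = MISSING
    specificity: Optional[str] = None  # "Yes", "No", or "Partial"; None = not yet rated

    @property
    def present(self) -> bool:
        # Question 1: does any text cover the item at all?
        return self.quoted_text is not None

    def passes(self) -> bool:
        # Question 2: the item only passes if it is present AND rated specific enough.
        return self.present and self.specificity == "Yes"


# Illustrative example: the sentence is found, but it does not say how the
# sequence was generated, so it is present without passing.
vague = ItemAudit("8a", "Participants were randomized", "No")
print(vague.present, vague.passes())
```

Keeping the two answers in separate fields means a vague sentence can never be counted as compliant just because it was found.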

The Audit Prompt Template

Feed the checklist and the manuscript as separate, clearly labeled blocks. Never run them together in one undifferentiated paste; the model conflates them.

Step 1 — Detection:

"Here is the CONSORT 2010 checklist: [paste checklist items]. Here is my manuscript: [paste]. For each checklist item, identify the line or paragraph where that item appears. If it does not appear, say 'MISSING.' Do not infer or assume — cite the actual text or mark missing."

Step 2 — Specificity check on flagged items:

"For each item you marked as present, quote the exact sentence that covers it. Then rate: does this sentence state the information specifically enough for a reader to reproduce the method? Yes / No / Partial. Explain briefly for any Partial or No."

Two separate prompts, run sequentially in the same conversation so that step 2 can refer back to the step 1 results. Step 1 builds the presence map; step 2 evaluates quality on the positives. The CONSORT 3b item I missed appeared as "MISSING" in step 1: I had simply never written the sentence.
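If you run the audit repeatedly, the two prompts can be assembled by a small helper. The sketch below only builds the prompt strings from the wording above; the function names are hypothetical, the checklist items in the example are abbreviated placeholders rather than the official wording, and sending the prompts to a model is left out because that depends on whichever client you use.

```python
def build_detection_prompt(checklist_name: str, checklist_items: list[str], manuscript: str) -> str:
    """Step 1: ask only where each item appears, or whether it is MISSING."""
    checklist_block = "\n".join(checklist_items)
    return (
        f"Here is the {checklist_name} checklist:\n{checklist_block}\n\n"
        f"Here is my manuscript:\n{manuscript}\n\n"
        "For each checklist item, identify the line or paragraph where that item appears. "
        "If it does not appear, say 'MISSING.' "
        "Do not infer or assume; cite the actual text or mark missing."
    )


# Step 2 is fixed text: it refers back to the step 1 answer in the same conversation.
SPECIFICITY_PROMPT = (
    "For each item you marked as present, quote the exact sentence that covers it. "
    "Then rate: does this sentence state the information specifically enough for a reader "
    "to reproduce the method? Yes / No / Partial. Explain briefly for any Partial or No."
)


def build_audit_turns(checklist_name: str, checklist_items: list[str], manuscript: str) -> list[str]:
    """Return the two user turns in the order they should be sent."""
    return [build_detection_prompt(checklist_name, checklist_items, manuscript), SPECIFICITY_PROMPT]


# Example with abbreviated, illustrative item labels and a placeholder manuscript.
turns = build_audit_turns(
    "CONSORT 2010",
    ["8a: sequence generation", "13a: participant flow"],
    "[manuscript text goes here]",
)
for number, turn in enumerate(turns, start=1):
    print(f"--- Step {number} ---\n{turn}\n")
```

Keeping step 2 as a constant makes it harder to merge the two questions back into one prompt by accident.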

STROBE vs CONSORT: Picking the Right Checklist

This matters for scoping the prompt correctly:

CONSORT 2010: randomized controlled trials. Use for any parallel-group, crossover, or cluster RCT.
STROBE: observational studies (cohort, case-control, cross-sectional). Version selection: the 2007 STROBE statement is the current main guidance; extensions exist for specific designs (STROBE-MR for Mendelian randomization, STROBE-nut for nutritional epidemiology).
When in doubt, check the journal's Instructions for Authors — most clinical journals specify which checklist they require and whether completion is mandatory or recommended.

Running the wrong checklist wastes time and produces false reassurance. The LLM won't tell you that you fed it the wrong checklist; it'll just audit against what you gave it.
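One way to guard against the wrong-checklist mistake is to make the choice explicit before any prompt is built. The lookup below is a minimal sketch covering only the designs named above; the `pick_checklist` name and the design labels are hypothetical, extensions are not handled, and the journal's Instructions for Authors still take precedence.

```python
# Study designs named in this article, mapped to the base checklist.
CHECKLIST_BY_DESIGN = {
    "parallel-group rct": "CONSORT 2010",
    "crossover rct": "CONSORT 2010",
    "cluster rct": "CONSORT 2010",
    "cohort": "STROBE",
    "case-control": "STROBE",
    "cross-sectional": "STROBE",
}


def pick_checklist(study_design: str) -> str:
    """Return the base checklist for a design, or fail loudly if the design is unknown."""
    key = study_design.strip().lower()
    if key not in CHECKLIST_BY_DESIGN:
        # Refusing to guess is the point: an audit against the wrong checklist
        # looks just as confident as one against the right checklist.
        raise ValueError(f"No checklist mapped for design {study_design!r}; check the journal's instructions.")
    return CHECKLIST_BY_DESIGN[key]


print(pick_checklist("Cohort"))
print(pick_checklist("cluster RCT"))
```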

What the LLM Misses

Three things the audit template cannot catch:

Statistical assumption adequacy. CONSORT 7a requires reporting how the sample size was determined. The model will find the sentence; it cannot evaluate whether the rationale is methodologically sound (for example, a wrong ICC or a wrong power assumption).
Figure and table completeness. LLMs auditing text don't read figures unless you upload them. CONSORT 13a (participant flow) and STROBE's outcome-reporting items often live in tables and flow diagrams — check those manually.
Cross-section consistency. A claim in the abstract that contradicts the results section is not a checklist item, but it will get a reviewer's attention. The LLM audit flags individual items, not cross-section logic.

Run the LLM audit for the mechanical layer. Reserve your final read for the judgment layer.

One More Use: The Submission Checklist for R2

When you're responding to reviewers and revising, the checklist audit catches drift — items that were compliant in v1 and broke in v2 because you restructured a section. Running the template on the revised manuscript before submission takes 10 minutes and has caught gaps in two of my revision rounds.
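Drift is easy to check mechanically once each audit is stored as a presence map. The sketch below assumes you have already recorded the step 1 result for each version as a dictionary from item label to either a location string or "MISSING"; that storage format, the `find_drift` name, and the example values are hypothetical.

```python
def find_drift(v1: dict[str, str], v2: dict[str, str]) -> list[str]:
    """Return items that were present in the v1 audit but are MISSING in the v2 audit."""
    regressed = []
    for item, status_v1 in v1.items():
        # An item absent from the v2 map is treated the same as MISSING.
        status_v2 = v2.get(item, "MISSING")
        if status_v1 != "MISSING" and status_v2 == "MISSING":
            regressed.append(item)
    return regressed


# Illustrative presence maps for two manuscript versions.
audit_v1 = {"8a": "Methods, paragraph 2", "13a": "Figure 1"}
audit_v2 = {"8a": "MISSING", "13a": "Figure 1"}

print(find_drift(audit_v1, audit_v2))  # prints ['8a']
```

Items that were already missing in v1 are deliberately excluded: they are ordinary gaps, not drift introduced by the revision.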

The Checklist: Idea to Submission bundles the CONSORT and STROBE audit templates alongside the full reporting prompt library — including STROBE extension variants and the step-2 specificity rubric. If you run checklists repeatedly across papers, the prompt structure is worth having in a reusable format rather than rebuilding from scratch each time. More on the systematic review side of this in the AI systematic review workflow and in the 10 Claude prompts I use weekly for paper writing.