When a Study Is Too Small to Matter

Sample size justification is one of the most routinely performed and least seriously considered parts of research planning. Researchers calculate what’s required, hit the number, and move on. The question of whether the study can actually answer the question it claims to address often goes unexamined.

The consequence is a category of published research that is technically complete but scientifically inadequate: studies that were large enough to recruit, small enough to be underpowered, and published because they were novel. Reviewers flag these studies. Editors accept them anyway. The literature accumulates evidence that can’t be meaningfully synthesized.

Recognizing when a study is too small to matter—before submission, ideally before data collection—is one of the more honest judgments a researcher can make.

What Underpowered Studies Actually Mean

An underpowered study doesn’t just have wider confidence intervals. It produces a systematic distortion.

When a study is underpowered, statistically significant results are more likely to be inflated estimates than accurate ones. The studies that “work”—that cross the significance threshold—are the ones where chance produced an effect large enough to survive the noise. This is the winner’s curse in clinical research: publish the positive result from a small trial, and you’ve almost certainly published an overestimate.

Conversely, a non-significant result from an underpowered study is uninformative. It doesn’t mean there’s no effect. It means the study couldn’t reliably detect one. When these studies enter systematic reviews, they inflate heterogeneity and complicate interpretation.
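
A quick simulation makes the distortion concrete. The sketch below is illustrative only: the true effect size, variance, and per-arm sample are assumptions chosen to mimic a small two-arm trial, not values from any real study. It simulates thousands of such trials and looks at what the ones that cross p < 0.05 actually report.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Assumed, purely illustrative parameters: a modest true standardized
# effect (d = 0.3) and 20 participants per arm, far too few to detect it.
true_effect = 0.3
n_per_arm = 20
n_trials = 10_000

significant_effects = []

for _ in range(n_trials):
    control = rng.normal(0.0, 1.0, n_per_arm)
    treated = rng.normal(true_effect, 1.0, n_per_arm)
    _, p = stats.ttest_ind(treated, control)
    if p < 0.05:
        # Observed mean difference; with SD = 1 this approximates Cohen's d.
        significant_effects.append(treated.mean() - control.mean())

power = len(significant_effects) / n_trials
print(f"Empirical power: {power:.2f}")  # around 0.15 under these assumptions
print(f"Mean effect among 'significant' trials: {np.mean(significant_effects):.2f}")
# The trials that reach significance report an effect roughly 2-3x the
# true value: the winner's curse in miniature.
```

Run with these assumptions, only a small fraction of trials reach significance, and the ones that do report an effect well above the true value. That inflated subset is what gets published.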

Why Good Studies Still Get Rejected covers the structural reasons editors flag small trials—it’s not only about statistical power, but about what editors can credibly defend to their readers. A study whose primary finding is a confidence interval spanning both clinical significance and no effect is difficult to position as a contribution.

The Decision Point: When to Pivot to a Pilot

The distinction that most researchers miss is this: a study that’s too small to be definitive is not automatically worthless. The question is whether it’s being framed as the right kind of study.

A weak RCT and a well-designed pilot study can use similar sample sizes. The difference is not in the data collected—it’s in the claims being made.

A weak RCT tries to answer a clinical question and fails. A well-designed pilot study is not trying to answer the clinical question. It’s trying to establish whether a definitive study is feasible: whether the intervention can be delivered consistently, whether outcome measures perform as expected, whether recruitment is viable, whether the effect size estimate is plausible enough to power a future trial.

These are answerable questions for a small study. Definitive efficacy is not.

The pivot from weak RCT to pilot happens when the honest answer to “what can this study actually establish?” is not “efficacy” but “feasibility and preliminary estimation.” If that’s the truth of what the study can do, framing it as a pilot is not a consolation. It’s the accurate characterization.

What Reviewers Are Actually Asking

When reviewers write comments like “the sample size is insufficient to draw conclusions” or “this study is severely underpowered,” they are making one of two distinct points that warrant different responses.

The first is that the study was designed as a definitive trial and the sample size fails to support that design. In this case, the response isn’t to reanalyze—it’s to reframe. The findings become hypothesis-generating, the conclusions become appropriately hedged, and the discussion focuses on what the data can and cannot support.

The second is that the study was designed as a pilot but is making claims that exceed pilot-level evidence. This is the inverse problem: the design is appropriately modest, but the interpretation overreaches. The fix is to tighten the conclusions rather than the design.

Both problems have the same root: a mismatch between what the study was built to do and what it’s claiming to have done. The Academic Publishing Game Nobody Explains describes how editors read for this kind of mismatch early in the evaluation process—it signals that the authors either don’t understand their own study or are hoping reviewers won’t notice.

Sample Size Justification That Actually Holds Up

A defensible sample size calculation is not just the arithmetic. It’s the chain of reasoning behind it.

The assumptions in a power calculation—the expected effect size, the variance, the minimally clinically important difference—each require a source. Using assumptions from a different population, a different measurement instrument, or an optimistic pilot estimate produces a number that looks rigorous and isn’t. Reviewers who know the literature notice when the assumed effect size doesn’t match what’s been observed in comparable trials.

The more honest approach: state the assumptions explicitly, cite where they came from, and acknowledge if they’re uncertain. If the assumed effect size is based on sparse preliminary data, say so. An honest sample size section that acknowledges uncertainty is more defensible than a precise calculation built on questionable assumptions.
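
One way to make that uncertainty visible is to rerun the calculation across the plausible range of effect sizes rather than anchoring on a single optimistic point estimate. A minimal sketch using statsmodels (the effect sizes below are placeholders, not recommendations):

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Placeholder scenarios: an optimistic pilot estimate, a value in line with
# comparable published trials, and a conservative floor.
scenarios = {
    "optimistic pilot estimate (d = 0.6)": 0.6,
    "comparable published trials (d = 0.4)": 0.4,
    "conservative assumption (d = 0.25)": 0.25,
}

for label, d in scenarios.items():
    # Omitting nobs1 tells solve_power to return the required sample per arm.
    n_per_arm = analysis.solve_power(
        effect_size=d, alpha=0.05, power=0.80, alternative="two-sided"
    )
    print(f"{label}: ~{n_per_arm:.0f} participants per arm")
```

The spread is the point: halving the assumed effect size roughly quadruples the required sample per arm.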

If the honest calculation produces a required sample size you cannot recruit, that’s not a problem with the calculation. It’s information about what kind of study is actually feasible.

The Value of Knowing What a Study Can’t Do

A small study that accurately represents what it can establish contributes something: a precise question for a future definitive trial, a validated protocol, an effect size estimate with honest uncertainty bounds.

A small study that claims more than its design supports contaminates the literature. It gives meta-analysts data they can’t trust and gives practitioners conclusions that may not hold.

Recognizing early—ideally before data collection—that a study is too small to answer the question it’s asking is not a failure. It’s the first step toward designing one that can.
