Explainer · Plain-language

What Is Sampling in Research? Types, Methods & Examples

Sampling in research is the process of selecting a subset of individuals, cases or units from a larger population to study. Because studying entire populations is usually impractical, sampling allows researchers to draw conclusions about the whole from a manageable part.

CASRAI plain-language explainers — clear answers to recurring research-administration questions

Probability sampling: when every unit has a known chance

Probability sampling methods give each unit in the sampling frame a known, non-zero probability of selection, enabling statistical inference from the sample to the population with a calculable margin of error. Simple random sampling gives each unit an equal probability of selection; systematic random sampling picks a random starting point and then selects every nth unit from an ordered list. Stratified sampling divides the population into subgroups (strata, e.g., age bands or institution types) and samples randomly within each, ensuring representation of key subgroups. Cluster sampling randomly selects geographic or organisational clusters (e.g., schools) and studies all units, or a random sample of units, within them; it is more practical when the population is geographically dispersed. Multi-stage sampling combines these methods (e.g., random selection of regions, then schools within regions, then pupils within schools).
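The four basic designs described above can be sketched in a few lines of Python. This is a minimal illustration, not a survey-grade implementation: the sampling frame of 1,000 pupils, the field names (`id`, `school`, `age_band`) and the helper function names are all hypothetical, and the stratified version assumes proportional allocation (the same fraction drawn from every stratum).

```python
import random
from collections import defaultdict


def simple_random_sample(frame, n, rng):
    """Every unit has the same chance; draw n units without replacement."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Pick a random start, then take every k-th unit from the ordered list."""
    k = len(frame) // n          # sampling interval
    start = rng.randrange(k)     # random start within the first interval
    return [frame[start + i * k] for i in range(n)]


def stratified_sample(frame, stratum_of, fraction, rng):
    """Draw the same fraction at random within each stratum."""
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for units in strata.values():
        n_h = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, n_h))
    return sample


def cluster_sample(frame, cluster_of, n_clusters, rng):
    """Select whole clusters at random and keep every unit inside them."""
    clusters = defaultdict(list)
    for unit in frame:
        clusters[cluster_of(unit)].append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for c in chosen for unit in clusters[c]]


# Hypothetical frame: 1,000 pupils spread evenly over 20 schools and 4 age bands.
rng = random.Random(42)
frame = [{"id": i, "school": i % 20, "age_band": i % 4} for i in range(1000)]

srs = simple_random_sample(frame, 100, rng)
sys_s = systematic_sample(frame, 100, rng)
strat = stratified_sample(frame, lambda u: u["age_band"], 0.10, rng)
clus = cluster_sample(frame, lambda u: u["school"], 2, rng)

print(len(srs), len(sys_s), len(strat), len(clus))
```

Each design returns 100 pupils here, but they differ in which pupils can end up together: the stratified sample always contains 25 pupils from each age band, while the cluster sample contains every pupil from just two schools. A multi-stage design would chain these steps, for example by calling `simple_random_sample` on the pupils inside each chosen cluster.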

Non-probability sampling: when random selection is not the goal

Non-probability sampling is used when probability sampling is impractical (no complete sampling frame exists), when the research is exploratory or qualitative, or when theoretical rather than statistical representativeness is sought. Convenience (accidental) sampling selects whoever is available; it is fast but prone to bias. Purposive (judgement) sampling deliberately selects cases that are information-rich for the research question. Snowball sampling asks existing participants to recruit others from their networks, which is particularly useful for hard-to-reach populations (e.g., people with stigmatised conditions, practitioners in specialised roles). Quota sampling sets target numbers for specified subgroups and fills them without random selection. Theoretical sampling (Glaser & Strauss, grounded theory) selects cases based on emerging theory, sampling until no new conceptual insights emerge (theoretical saturation).
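Two of these methods have a simple procedural core that a short sketch can make concrete. The code below is illustrative only: the function names, the `role` field, the quota targets and the referral network are all hypothetical, and real recruitment involves consent, eligibility checks and non-response that are not modelled here.

```python
from collections import Counter, deque


def quota_sample(arrivals, group_of, quotas):
    """Accept people in arrival order until every subgroup quota is full."""
    filled = Counter()
    sample = []
    for person in arrivals:
        group = group_of(person)
        if filled[group] < quotas.get(group, 0):
            sample.append(person)
            filled[group] += 1
        if all(filled[g] >= target for g, target in quotas.items()):
            break  # all quotas met; stop recruiting
    return sample


def snowball_sample(seeds, referrals, max_size):
    """Grow the sample wave by wave through participants' referrals."""
    sample = list(seeds)
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        current = queue.popleft()
        for contact in referrals.get(current, []):
            if contact not in seen and len(sample) < max_size:
                seen.add(contact)
                sample.append(contact)
                queue.append(contact)
    return sample


# Hypothetical stream of respondents: every fourth arrival is staff.
arrivals = [{"id": i, "role": "staff" if i % 4 == 0 else "student"}
            for i in range(40)]
quota = quota_sample(arrivals, lambda p: p["role"], {"staff": 3, "student": 5})

# Hypothetical referral network: who each participant says they can introduce.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"]}
snowball = snowball_sample(["A"], referrals, max_size=5)

print([p["id"] for p in quota])
print(snowball)
```

Note that neither function uses a random number generator: who ends up in the sample depends on arrival order or on social ties, which is exactly why selection probabilities are unknown for these designs.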

Sample size: statistical and qualitative considerations

In quantitative research, sample size is determined by a power calculation: the minimum number of participants needed to detect a true effect of a specified size at a given α level with the desired power. The key inputs are the expected effect size, the population variance, α (usually 0.05), and the desired power (commonly 0.80). Increasing sample size reduces standard errors and increases power. Sampling error, the difference between a sample estimate and the true population value, shrinks in proportion to one over the square root of the sample size, so halving the standard error (doubling precision) requires quadrupling the sample. In qualitative research, sample size is guided by saturation: Glaser and Strauss’s concept that sampling continues until additional data produce no new theoretical insights. Typical qualitative samples are small (10–40 for interview-based studies) because depth, not breadth, is the goal.
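As a rough illustration of the quantitative case, the sketch below computes the sample size per group for a two-sided comparison of two group means, using the common normal-approximation formula n = 2 × ((z for α/2 + z for power) / d)², where d is the standardised effect size (mean difference divided by the standard deviation). The two-group design, the function names and the example values are assumptions chosen for illustration; dedicated power software, which uses the t distribution, typically returns a slightly larger n.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    effect_size is the standardised difference (Cohen's d).
    Uses the normal approximation, so results are slightly optimistic.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def standard_error(sd, n):
    """Standard error of a sample mean: sd divided by the square root of n."""
    return sd / math.sqrt(n)


for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: about {n_per_group(d)} per group")

# Quadrupling the sample halves the standard error.
print(standard_error(10, 100), standard_error(10, 400))
```

Under these assumptions a medium effect (d = 0.5) needs about 63 participants per group, while a small effect (d = 0.2) needs about 393, which shows how strongly the expected effect size drives the required sample.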

Sampling bias, representativeness and the census alternative

Sampling bias occurs when the sampling procedure systematically excludes or over-represents certain units, producing a sample that is unrepresentative of the target population. Famous examples include the 1936 Literary Digest poll, whose mailing list was drawn largely from telephone directories and automobile registrations and relied on voluntary mail-back responses; it over-represented Republican-leaning voters and predicted a Landon landslide against Roosevelt. Online surveys are a modern example, because they exclude people without internet access. Representativeness matters for generalisability: probability samples from a well-defined sampling frame are more likely to produce representative samples than convenience samples. A census attempts to include every unit in the population, eliminating sampling error but incurring much higher data-collection cost and logistical complexity.
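A small simulation can show why a biased frame is not rescued by sample size. Everything in the sketch below is invented for illustration: the population of 100,000 voters, the 30% telephone ownership rate and the support rates for a hypothetical candidate A are assumptions, not historical figures.

```python
import random

rng = random.Random(1936)

# Hypothetical population: 30% own a telephone, and owners support
# candidate A less often (40%) than non-owners (70%).
population = []
for _ in range(100_000):
    owns_phone = rng.random() < 0.30
    supports_a = rng.random() < (0.40 if owns_phone else 0.70)
    population.append((owns_phone, supports_a))


def share_supporting_a(people):
    """Proportion of the given people who support candidate A."""
    return sum(supports for _, supports in people) / len(people)


true_share = share_supporting_a(population)

# Large sample drawn from a biased frame (telephone owners only).
phone_owners = [p for p in population if p[0]]
big_biased = rng.sample(phone_owners, 20_000)

# Much smaller simple random sample from the whole population.
small_random = rng.sample(population, 1_000)

print(f"true share:            {true_share:.3f}")
print(f"biased, n = 20,000:    {share_supporting_a(big_biased):.3f}")
print(f"random, n = 1,000:     {share_supporting_a(small_random):.3f}")
```

In this setup the large sample estimates support among telephone owners very precisely, but that is the wrong population; the small random sample is noisier yet centred on the true value.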

Key facts

At a glance

Definition: Selecting a subset from a population for study
Two branches: Probability (random, known chance) vs non-probability sampling
Probability: Simple random, systematic, stratified, cluster, multi-stage
Non-probability: Convenience, purposive, snowball, quota, theoretical (grounded theory)
Sample size: Determined by power calculation (quantitative) or saturation (qualitative)
Bias: Systematic exclusion or over-representation — threatens generalisability
Saturation: Glaser & Strauss — point where no new themes emerge from sampling

Common misconceptions

What people often get wrong

Often heard: A larger sample is always better.

Actually: Not always. A large biased sample is worse than a smaller representative one. The Literary Digest poll (about 2.4 million responses) famously failed because its sample was systematically biased; George Gallup predicted the 1936 election correctly with n ≈ 50,000 using quota sampling.

Often heard: Qualitative research should use random sampling.

Actually: Not typically — qualitative research uses purposive or theoretical sampling to select information-rich cases relevant to the research question, not to achieve statistical representativeness. The goal is theoretical insight, not statistical generalisation.

Often heard: Sampling error can be eliminated by using a large enough sample.

Actually: No. Sampling error can be reduced but not eliminated (unless you use a census). With probability sampling, sampling error can be quantified and reported as a margin of error; with non-probability sampling, selection probabilities are unknown, so a valid margin of error cannot be calculated.

Going deeper