Guide

Data collection methods

Data collection methods are the systematic techniques researchers use to gather information — spanning primary and secondary sources, and quantitative and qualitative approaches.

Primary versus secondary data

The first decision is whether to collect new data yourself or reuse data that already exists. Primary data is gathered directly by the researcher for the specific study — through surveys, experiments, interviews or observation — giving full control over what is measured and how, at the cost of time and resources. Secondary data is existing information collected by someone else for another purpose, such as government statistics, published datasets, administrative records or prior research. Secondary data is faster and cheaper and enables large-scale or historical analysis, but you inherit the original definitions, quality and gaps, and the data may not fit your question precisely. Many studies combine both, using secondary data for context and primary data for the specific gap.

Quantitative methods

Quantitative methods gather numerical data that can be counted, measured and analysed statistically, and are typically used to test hypotheses and quantify relationships. Surveys and questionnaires with closed (fixed-response) items collect standardised data from large samples efficiently, supporting generalisation when sampling is sound. Experiments manipulate an independent variable under controlled conditions, ideally with random assignment of participants to conditions, and measure the effect on a dependent variable, making them the strongest design for causal inference. Structured observation records the frequency or duration of predefined behaviours against a coding scheme. The strength of quantitative methods is precision, comparability and statistical power; the trade-off is that they capture predetermined variables and can miss meaning, context and the unexpected.
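
As a rough illustration of structured observation, the sketch below tallies how often and for how long each predefined behaviour occurs in a session. The coding scheme, the behaviour codes, the `tally_observations` helper and the sample events are all hypothetical, invented for this example; a real study would define its own scheme and recording rules.

```python
from collections import defaultdict

# Hypothetical coding scheme: the only behaviours an observer may record.
CODING_SCHEME = {"on_task", "off_task", "asks_question"}


def tally_observations(events):
    """Return {code: (frequency, total_seconds)} for coded events.

    Each event is a (code, start_second, end_second) tuple. A code that is
    not in the scheme raises ValueError, because structured observation
    only records behaviours defined in advance.
    """
    counts = defaultdict(int)
    durations = defaultdict(float)
    for code, start, end in events:
        if code not in CODING_SCHEME:
            raise ValueError(f"code not in scheme: {code!r}")
        if end < start:
            raise ValueError("event ends before it starts")
        counts[code] += 1
        durations[code] += end - start
    return {code: (counts[code], durations[code]) for code in counts}


# Illustrative data only, not from a real observation session.
session = [
    ("on_task", 0, 120),
    ("off_task", 120, 150),
    ("asks_question", 150, 160),
    ("on_task", 160, 300),
]
summary = tally_observations(session)
for code, (frequency, seconds) in sorted(summary.items()):
    print(f"{code}: {frequency} event(s), {seconds:.0f} s")
```

Rejecting codes outside the scheme is a deliberate choice here: it keeps the record comparable across observers, at the cost of ignoring anything the scheme did not anticipate.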

Qualitative methods

Qualitative methods gather non-numerical data — words, images and meanings — to explore experiences, processes and the reasons behind behaviour, and are well suited to exploratory and theory-building research. In-depth interviews allow detailed, flexible exploration of individual perspectives, with semi-structured formats balancing consistency and openness. Focus groups bring several participants together so that interaction surfaces shared and contested views. Ethnography and participant observation immerse the researcher in a setting over time to understand behaviour in its natural context. Document analysis interprets existing texts, records and artefacts. Qualitative methods yield rich, contextual insight, but typically use smaller samples, depend on careful interpretation, and aim for transferability rather than statistical generalisation.

How to choose a method

The method should follow the research question, not the other way around. Questions about "how many", "how much" or "is there a difference" call for quantitative methods; questions about "how", "why" or "what does this mean" call for qualitative methods. Then weigh practical constraints: the population and how to reach it, available time and budget, the access and skills required, and ethical considerations such as consent and confidentiality. The unit and level of analysis, and whether you need to generalise to a population or understand a case in depth, also steer the choice. Mixed-methods designs deliberately combine both to offset each approach’s weaknesses, for example using a survey to map a pattern and interviews to explain it.

Validity and reliability

Whatever method you use, the data must be trustworthy. Reliability concerns consistency: whether the method produces the same results under the same conditions, across time, raters and items. Validity concerns accuracy: whether the method measures what it claims to (construct validity), whether an observed effect can be attributed to the studied cause rather than a confound (internal validity), and whether the findings hold beyond the study’s sample and setting (external validity). In quantitative work these are assessed statistically (for example, test–retest reliability and inter-rater agreement) and protected by standardised instruments, piloting and clear operational definitions. In qualitative work the parallel concern is trustworthiness, addressed through credibility, transferability, dependability and confirmability, using techniques such as triangulation, member checking and a transparent audit trail. Poor measurement undermines every later step, so safeguarding validity and reliability is part of choosing and applying any data collection method.
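
To make the statistical checks more concrete, here is a minimal sketch of two common reliability estimates: Cohen's kappa for agreement between two raters, and a Pearson correlation used as a test–retest coefficient. The function names and the sample ratings and scores are assumptions made for illustration; the guide does not prescribe particular statistics, and an established statistics package would normally be preferable to hand-rolled code.

```python
from collections import Counter
from math import sqrt


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)
    # Proportion of items on which the raters gave the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


def pearson_r(x, y):
    """Pearson correlation, used here as a test-retest coefficient."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Illustrative data only: two raters coding the same eight items.
rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
rater_b = ["yes", "yes", "no", "yes", "yes", "no", "no", "no"]
kappa = cohens_kappa(rater_a, rater_b)

# Illustrative data only: the same five participants measured twice.
time_1 = [10, 12, 14, 16, 18]
time_2 = [11, 12, 15, 15, 19]
retest_r = pearson_r(time_1, time_2)

print(f"inter-rater kappa: {kappa:.2f}")
print(f"test-retest r: {retest_r:.2f}")
```

Kappa is used instead of raw percentage agreement because two raters can agree fairly often by chance alone, especially when there are few categories.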

Key facts

At a glance

Definition: systematic techniques for gathering data to answer a research question
First split: primary (new data) versus secondary (existing data)
Second split: quantitative (numbers) versus qualitative (words, meanings)
Quantitative: surveys, experiments, structured observation
Qualitative: interviews, focus groups, ethnography, document analysis
Always check: validity (accuracy) and reliability (consistency)

Common questions

FAQ

What is the difference between quantitative and qualitative data collection?

Quantitative methods gather numerical data to count, measure and test relationships statistically, using tools such as closed-question surveys and experiments. Qualitative methods gather non-numerical data — words, images and meanings — through interviews, focus groups and observation to explore experiences and reasons. Quantitative work prioritises measurement and generalisation; qualitative work prioritises depth and context. Mixed-methods research combines both.

What is the difference between primary and secondary data?

Primary data is collected first-hand by the researcher for the specific study, giving control over what is measured but requiring time and resources. Secondary data already exists — collected by others for a different purpose, such as official statistics or published datasets — so it is faster and cheaper but may not fit the question precisely and carries the original collectors’ definitions and limitations.

How do I choose the right data collection method?

Start from your research question: "how many" and "is there an effect" questions point to quantitative methods, while "how" and "why" questions point to qualitative methods. Then weigh your population, access, time, budget and ethics, and whether you need to generalise or understand a case in depth. A mixed-methods design can combine approaches to offset each one’s weaknesses.

Going deeper