Tag: research lifecycle

  • Systematic Review vs Meta-Analysis: The Difference Explained

    A systematic review is a structured, protocol-driven synthesis that identifies, appraises and summarises all studies meeting pre-specified criteria. A meta-analysis is an optional statistical step within or after such a review that pools the numerical results into a single combined estimate. Every meta-analysis should rest on a systematic review, but not every systematic review contains a meta-analysis.

    Two related but distinct things

    The terms are often used interchangeably, which causes real confusion. The systematic review is the method: a comprehensive search, transparent selection, risk-of-bias assessment and synthesis. The meta-analysis is one possible synthesis technique — combining effect estimates statistically to gain precision and to quantify how consistent the studies are. A review may instead use a narrative or structured qualitative synthesis when pooling is not appropriate.

    Feature | Systematic review | Meta-analysis
    What it is | Structured synthesis of all eligible studies | Statistical pooling of study results
    Output | Narrative or quantitative summary, evidence tables | Combined effect estimate with confidence interval
    Always needs the other? | No; can stand alone | Yes; should rest on a systematic review
    Key risk | Incomplete or biased search | Pooling heterogeneous or incomparable studies
    Visual artefact | PRISMA flow diagram, evidence tables | Forest plot, funnel plot

    PRISMA reporting underpins both

    Whether or not pooling occurs, the review should be reported to the PRISMA 2020 standard. PRISMA’s checklist and flow diagram make the search, selection and synthesis auditable. When a meta-analysis is performed, PRISMA additionally expects the synthesis methods, the model used, and the handling of heterogeneity and certainty to be reported.

    Heterogeneity: the decisive question

    The central judgement in any meta-analysis is whether the studies are similar enough to combine. Heterogeneity describes the variability in true effects across studies, beyond what chance alone would produce. Reviewers assess it visually, by checking how far the study confidence intervals overlap, and with statistics such as the χ² test for heterogeneity (Cochran’s Q) and the I² statistic, which estimates the share of total variation that reflects real differences between studies rather than chance. High heterogeneity warns that a single pooled number may be misleading: combining apples and oranges produces a fruit salad, not an average. Where studies differ in populations, interventions or outcomes, a random-effects model, subgroup analysis or a decision not to pool at all may be the honest choice.
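
    As a rough illustration of the heterogeneity statistics mentioned above, the sketch below computes Cochran’s Q (the χ² statistic for heterogeneity) and I² from a handful of effect estimates. The function name and the four log odds ratios with their variances are invented for the example; they do not come from any real review.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, its degrees of freedom, and I2 as a percentage."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted squared deviations of each study from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2: share of total variation beyond what chance alone would explain
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Hypothetical log odds ratios and variances (illustrative only)
effects = [0.10, 0.30, 0.35, 0.65]
variances = [0.04, 0.03, 0.05, 0.02]

q, df, i2 = heterogeneity(effects, variances)
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f}%")
```

    A Q value well above its degrees of freedom, or a large I², is a prompt to look for sources of variation before trusting a pooled number; neither statistic replaces judgement about whether the studies are clinically comparable.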

    Forest plots and reading the result

    The signature output of a meta-analysis is the forest plot. Each study appears as a point estimate with a confidence interval, sized by its weight, and the pooled estimate sits at the bottom, often as a diamond whose width is its confidence interval. A funnel plot, meanwhile, is used to inspect for small-study effects and possible publication bias. These plots are how readers see, at a glance, both the central estimate and the spread of the evidence behind it.
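
    The numbers behind a forest plot can be sketched in a few lines. The example below assumes a simple fixed-effect, inverse-variance model, which is only one of several pooling approaches, and uses three made-up studies; the function name `pool_fixed_effect` is likewise hypothetical.

```python
import math


def pool_fixed_effect(effects, std_errors, z=1.96):
    """Inverse-variance pooling: returns the estimate, its 95% CI and % weights."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    se_pooled = math.sqrt(1.0 / total)
    ci = (estimate - z * se_pooled, estimate + z * se_pooled)
    pct_weights = [100.0 * w / total for w in weights]
    return estimate, ci, pct_weights


# Hypothetical studies: name -> (effect estimate, standard error)
studies = {"Study A": (0.20, 0.10), "Study B": (0.35, 0.20), "Study C": (0.10, 0.15)}
effects = [y for y, _ in studies.values()]
std_errors = [se for _, se in studies.values()]

estimate, ci, pct_weights = pool_fixed_effect(effects, std_errors)

# One text row per study, then the pooled "diamond" row
for (name, (y, se)), w in zip(studies.items(), pct_weights):
    print(f"{name:8s} {y:5.2f} [{y - 1.96 * se:5.2f}, {y + 1.96 * se:5.2f}]  weight {w:4.1f}%")
print(f"{'Pooled':8s} {estimate:5.2f} [{ci[0]:5.2f}, {ci[1]:5.2f}]")
```

    More precise studies (smaller standard errors) receive larger weights, which is why they are drawn as larger squares. A random-effects model would spread the weights more evenly and usually widen the pooled interval.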

    When meta-analysis is — and isn’t — appropriate

    Pooling is appropriate when studies ask a comparable question, measure comparable outcomes and are methodologically sound enough that a combined estimate is meaningful. It is inappropriate when heterogeneity is high and unexplained, when studies are at high risk of bias, or when outcomes are not commensurable. In those cases a rigorous systematic review with a narrative synthesis is the stronger contribution. For more, see our research-lifecycle coverage, the CASRAI dictionary, and how reviews fit the hierarchy of evidence.

    Frequently asked questions

    Is a meta-analysis always better than a systematic review?

    No. A meta-analysis adds statistical precision only when the underlying studies are comparable and sound. Pooling heterogeneous or biased studies produces a precise but misleading number. A careful systematic review without pooling is often the more honest result.

    What does heterogeneity tell me?

    It tells you how much the true effects vary across studies beyond chance. High, unexplained heterogeneity is a signal to investigate sources of variation and to question whether a single pooled estimate is meaningful.

    What is a forest plot?

    A forest plot displays each study’s effect estimate and confidence interval alongside the pooled result, letting readers see both the combined estimate and the consistency of the evidence at a glance.

    Do both follow the same reporting standard?

    Yes. Both follow PRISMA 2020, with extra synthesis items reported when a meta-analysis is conducted. See our author guidance for preparing a compliant manuscript.

  • Evidence-Based Medicine and the Hierarchy of Evidence

    Evidence-based medicine (EBM) is, in Sackett’s classic definition, the conscientious, explicit and judicious use of current best evidence in making decisions, integrating individual expertise with the best available external evidence and with the values and preferences of the people affected. It is a methodological framework for appraising and applying research, not clinical advice in itself.

    The three pillars of EBM

    EBM rests on three components that must be combined, never used in isolation:

    • Best available evidence — the findings of well-conducted research, appraised for validity and relevance.
    • Expertise — the judgement and experience that interpret evidence in context.
    • Values and preferences — what matters to the individual or population, including acceptability and burden.

    Evidence alone does not make a decision. A high-certainty finding may still be the wrong choice when it conflicts with the preferences of the people involved. EBM is the disciplined act of weighing all three.

    The hierarchy of evidence

    Not all study designs answer a question with equal confidence. The hierarchy of evidence — often drawn as a pyramid — ranks designs by their typical resistance to bias for questions about effectiveness. Higher tiers offer stronger protection against confounding and chance, though the hierarchy is a heuristic, not an absolute law.

    Tier (strongest first) | Design | Why it ranks there
    1 | Systematic reviews & meta-analyses | Synthesise all eligible studies; reduce single-study chance
    2 | Randomised controlled trials | Randomisation balances known and unknown confounders
    3 | Cohort studies | Follow groups over time; vulnerable to confounding
    4 | Case-control studies | Compare people with and without the outcome, looking back at past exposure; recall bias
    5 | Case series & reports | Describe without comparison; hypothesis-generating
    6 | Expert opinion | Useful where evidence is absent; lowest protection from bias

    A well-conducted systematic review of randomised trials sits at the top because it combines the internal validity of randomisation with the breadth of synthesis. Expert opinion sits at the bottom not because it is worthless — it is indispensable where evidence is thin — but because it offers the least protection against bias.

    Beyond the pyramid: GRADE and certainty

    The simple pyramid has a known limitation: a sloppy randomised trial can be weaker than a rigorous cohort study. Modern EBM therefore separates the design from the certainty of the evidence. The GRADE approach (Grading of Recommendations, Assessment, Development and Evaluations) rates certainty as high, moderate, low or very low. It starts from the design but can downgrade for risk of bias, inconsistency, indirectness, imprecision and publication bias — and occasionally upgrade observational evidence for a large effect or a dose-response gradient. GRADE is why a review can begin with randomised trials yet still report only low-certainty evidence.
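
    The logic of starting from the design and then moving up or down can be caricatured in code. The sketch below is a deliberately simplified toy, not the GRADE method itself: real GRADE ratings are structured judgements made per outcome, not arithmetic, and the function name and scoring are assumptions made for illustration.

```python
LEVELS = ["very low", "low", "moderate", "high"]


def grade_certainty(design, downgrades=0, upgrades=0):
    """Toy GRADE-style rating: start from the design, then move down or up.

    downgrades: levels removed for risk of bias, inconsistency, indirectness,
                imprecision or publication bias.
    upgrades:   levels added, e.g. for a large effect or dose-response gradient
                (in practice applied to observational evidence).
    """
    start = 3 if design == "randomised" else 1  # trials start high, observational low
    score = start - downgrades + upgrades
    return LEVELS[max(0, min(3, score))]        # clamp to the four levels


# Randomised trials with serious risk of bias and serious imprecision
print(grade_certainty("randomised", downgrades=2))
# Observational studies showing a large effect
print(grade_certainty("observational", upgrades=1))
```

    The first call shows how a body of randomised evidence can end up rated low; the second shows observational evidence rising to moderate.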

    Putting EBM into practice

    The familiar EBM cycle is: ask a focused question, acquire the best evidence, appraise it critically, apply it in context, and assess the outcome. Reporting standards underpin each step — PRISMA for the reviews at the top of the hierarchy and CONSORT for the trials beneath them. For definitions of these terms see the CASRAI dictionary and our research-lifecycle coverage.

    Frequently asked questions

    Who coined evidence-based medicine?

    The term is generally credited to Gordon Guyatt at McMaster University in the early 1990s and was popularised by the group associated with David Sackett, whose definition (the conscientious, explicit and judicious use of current best evidence) remains the standard reference. EBM grew from earlier work on clinical epidemiology and critical appraisal.

    Does the hierarchy mean expert opinion is useless?

    No. Expert opinion is essential where higher-tier evidence does not exist and provides the judgement that interprets evidence. The hierarchy ranks designs by susceptibility to bias for effectiveness questions, not by overall value.

    How does GRADE differ from the pyramid?

    The pyramid ranks study designs; GRADE rates the certainty of a body of evidence for a specific question. GRADE can downgrade strong designs for limitations or upgrade observational evidence, giving a more nuanced verdict than design alone.

    How does EBM relate to systematic reviews?

    Systematic reviews and meta-analyses sit at the top of the hierarchy because they synthesise eligible studies into the best available answer. See our explainer on systematic reviews versus meta-analyses and our author guidance.

  • What Is Research? Meaning, Types and the Research Lifecycle

    Research is a systematic process of investigation undertaken to discover new knowledge, confirm or revise existing understanding, and answer questions that have not yet been adequately resolved. What separates research from casual enquiry is its systematic character: it follows a planned, transparent method, gathers evidence deliberately, and subjects its conclusions to scrutiny. The result is intended to be reliable knowledge that others can examine, build upon and, ideally, reproduce.

    Basic and applied research

    Research is commonly divided into two broad orientations. Basic research, sometimes called fundamental or pure research, seeks to expand understanding for its own sake, without a specific application in mind. Investigating how a protein folds or why a mathematical relationship holds is basic research. Applied research addresses a particular practical problem: developing a treatment, improving a manufacturing process or evaluating a policy. The two are not rivals but a continuum, and basic findings frequently enable later applied advances.

    Type | Primary aim | Example
    Basic research | Advance fundamental understanding | Studying the mechanism of cell division
    Applied research | Solve a defined practical problem | Testing a new drug to prevent a disease

    The research lifecycle

    Most research, whatever its field, moves through a recognisable sequence of stages often described as the research lifecycle. While disciplines differ in detail, the lifecycle gives a shared vocabulary for the work and for the data and outputs it produces.

    Stage | What happens
    Question | Identify a gap and frame a clear, answerable research question or hypothesis
    Design | Choose methods, plan sampling and analysis, address ethics and feasibility
    Data | Collect, manage and document data according to the plan
    Analysis | Interpret the evidence using appropriate methods and statistics
    Dissemination | Report findings through publications, datasets and other shared outputs

    The lifecycle is iterative rather than strictly linear. Analysis often raises new questions, and dissemination feeds the next cycle of enquiry. Crucially, each stage generates information (about methods, samples, instruments and results) that needs to be described consistently so others can understand and reuse it.

    Analysis and the role of statistics

    The analysis stage is where evidence becomes findings. In quantitative research this typically draws on statistics, using descriptive summaries to characterise data and inferential methods to generalise responsibly. Careful analysis distinguishes signal from noise, reports uncertainty honestly through measures such as confidence intervals, and resists over-interpreting chance patterns. Weak analysis is a recognised threat to the trustworthiness of the resulting knowledge.
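
    To make the idea of reporting uncertainty concrete, here is a minimal sketch that summarises a sample with its mean and a 95% confidence interval. It assumes a normal approximation, which is a simplification (a t-distribution is usually preferred for small samples), and the measurements are invented.

```python
import math
import statistics


def mean_with_ci(values, confidence=0.95):
    """Sample mean with a normal-approximation confidence interval."""
    mean = statistics.mean(values)
    std_error = statistics.stdev(values) / math.sqrt(len(values))
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, (mean - z * std_error, mean + z * std_error)


# Hypothetical measurements (illustrative only)
data = [4.1, 5.0, 4.7, 5.3, 4.9, 5.1, 4.6, 5.2]
mean, (low, high) = mean_with_ci(data)
print(f"mean = {mean:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

    Reporting the interval alongside the mean shows readers how much the estimate could plausibly move with a different sample, which is the honest reporting of uncertainty the paragraph above describes.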

    Reproducibility and the scholarly record

    Research only contributes durable knowledge if its claims can be checked. Reproducibility, the ability of others to obtain consistent results using the same data and methods, depends on transparent reporting of every lifecycle stage. The scholarly record, the accumulated and citable body of publications, datasets and metadata, is the lasting product of research. CASRAI’s mission is to standardise the terminology used to describe research activities and outputs, which directly supports clearer reporting and reuse. Explore the CASRAI dictionary, the research lifecycle category and the author guidance for related resources.

    Frequently asked questions

    What makes an activity count as research?

    Research is distinguished by being systematic, methodical and aimed at producing generalisable or transferable knowledge. A planned investigation with documented methods and conclusions open to scrutiny qualifies; an unstructured opinion does not.

    Is the research lifecycle the same in every field?

    The broad stages (question, design, data, analysis and dissemination) are common across disciplines, but the methods within each stage vary widely. A laboratory experiment, a clinical trial and an archival history study share the lifecycle shape while differing in technique.

    How does CASRAI relate to research?

    CASRAI develops shared, standardised vocabularies for describing the people, activities and outputs of research. Consistent terminology across the lifecycle makes outputs easier to find, compare, reuse and reproduce, strengthening the scholarly record as a whole.

  • Double-Blind Studies and Bias Control

    A double-blind study is a controlled trial in which neither the participants nor the researchers who deliver the intervention and assess outcomes know who has been assigned to which group. By concealing allocation from both sides, the design neutralises the conscious and unconscious expectations that would otherwise distort behaviour, treatment and measurement, making it a cornerstone of unbiased and credible causal research.

    Blinding works alongside randomisation. Where randomisation balances groups at the start, blinding keeps them comparable thereafter by preventing knowledge of assignment from influencing what happens next. The two are complementary pillars of the randomised controlled trial, and a study that randomises well but fails to blind can still be undermined by the expectations of those involved.

    The biases that blinding controls

    Different biases threaten a study at different points in its lifecycle. Blinding and allied safeguards each target a specific threat.

    Bias | Where it arises | Primary safeguard
    Selection bias | At group assignment | Randomisation and allocation concealment
    Performance bias | During the intervention | Blinding of participants and care providers
    Detection bias | At outcome measurement | Blinding of outcome assessors
    Attrition bias | From dropout and missing data | Intention-to-treat analysis, follow-up

    Selection bias occurs when groups differ systematically before treatment begins; it is addressed not by blinding but by randomisation and concealment. Performance bias arises when one group receives different co-interventions or attention because their assignment is known. Detection bias creeps in when those measuring outcomes are influenced by knowing who received what — especially for subjective endpoints. Attrition bias emerges when dropout differs between groups, which is why retention and intention-to-treat analysis matter.

    Why each bias matters

    It is worth understanding why these biases are so damaging. Performance bias inflates or deflates an apparent effect because one group is, in practice, treated differently — perhaps receiving more attention, additional co-interventions or subtly different care — purely because their assignment is known. Detection bias is especially insidious with subjective outcomes such as pain, mood or function, where an assessor who knows the assignment may unconsciously rate the treatment group more favourably. Attrition bias distorts results when participants who drop out differ systematically between groups; if those doing poorly on one treatment leave more often, the survivors make that treatment look better than it is. Each bias, left unchecked, can manufacture an effect that is not real or hide one that is.

    Single, double and triple blind

    The number of “blinds” describes how many parties are kept unaware of assignment. In a single-blind design, participants do not know their group, but the researchers do — controlling expectation effects in participants while leaving performance and detection bias on the researcher side unaddressed. A double-blind design conceals assignment from both participants and the clinicians delivering and assessing care, the configuration most associated with rigorous trials. A triple-blind design extends concealment further, typically to the statisticians or committee analysing the data, so that interpretation cannot be skewed by knowledge of group identity. The more parties blinded, the more points in the study where bias is closed off — though additional blinding adds logistical cost and is not always feasible. Most well-conducted trials settle on double-blinding as the practical balance between rigour and feasibility, reserving triple-blinding for contexts where analytic interpretation is especially sensitive to expectation.

    How blinding is achieved in practice

    Blinding is more than an intention; it requires concrete mechanisms. Identical-appearing treatments — matching tablets, capsules or infusions — keep participants unaware of their assignment, while a placebo provides an indistinguishable comparator. Coded packaging, central randomisation systems and independent statisticians who work with masked group labels extend concealment through delivery and analysis. Good trials also test whether blinding actually held, by asking participants and staff to guess their assignment; if guesses are better than chance, unblinding may have crept in and the results must be interpreted with that in mind. In drug trials, achieving an indistinguishable placebo can itself be a substantial design challenge, since taste, appearance and even the absence of expected side effects can betray which arm a participant is in. Where a perfect match is impossible, an active placebo that mimics minor side effects is sometimes used to preserve the masking.
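
    One simple way to check whether guesses are better than chance is an exact binomial tail probability. The sketch below assumes two equally sized arms, so a pure guess is right half the time, and ignores “don’t know” answers; dedicated blinding indices handle those cases more carefully. The function name and the counts are hypothetical.

```python
from math import comb


def prob_at_least(correct, total, p_chance=0.5):
    """P(X >= correct) if every guess were right with probability p_chance."""
    return sum(
        comb(total, k) * p_chance ** k * (1 - p_chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Hypothetical end-of-trial questionnaire: 100 participants guess their arm
for correct in (52, 60, 70):
    p = prob_at_least(correct, 100)
    print(f"{correct}/100 correct guesses: one-sided p = {p:.4f}")
```

    A very small probability suggests participants could tell which arm they were in. It does not say why: correct guesses can reflect noticeable side effects or a genuine treatment benefit, so the result needs interpreting rather than treating as automatic proof of failed masking.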

    When blinding fails or breaks

    Blinding can be compromised even when well designed. Distinctive side effects can reveal which treatment a participant received; a dramatic clinical response can tip off an assessor; and emergencies sometimes require deliberate unblinding for safety. Each of these reintroduces the very biases blinding was meant to prevent. The mitigations are practical: rely on objective endpoints where possible, keep outcome assessors separate from those managing side effects, and document any unblinding so that readers can judge its likely effect. The aim is not perfection but transparency about how well masking was maintained.

    When blinding is impossible

    Some interventions cannot be hidden. Surgery, physiotherapy, dietary changes and many behavioural interventions are inherently visible to participants and providers. In these cases, researchers preserve as much rigour as possible by blinding the outcome assessors — particularly for subjective measures — and by using objective endpoints that are harder to influence. The placebo, a classic blinding tool, is discussed in our article on the placebo effect in controlled trials; where no convincing sham is feasible, transparency about the limitation becomes essential. These designs are common across the confirmatory studies described in our overview of the pharmaceutical R&D pipeline.

    Reporting and verification

    Readers can only judge a study’s protection against bias if blinding is reported clearly: who was blinded, how concealment was maintained, and whether it was successful. Reporting guidelines for trials ask authors to state explicitly which parties were masked and to flag any departures, precisely because vague phrases like “double-blind” are sometimes used loosely. This kind of methodological transparency, encouraged in our guidance for authors and across the research lifecycle, lets others assess and reuse the evidence with confidence. Documenting blinding alongside the standardised terminology in the CASRAI dictionary makes a trial’s safeguards legible to replicators and reviewers alike, rather than leaving them to be inferred.

    Frequently asked questions

    What is the difference between single and double blind?

    In a single-blind study only the participants are unaware of their group; in a double-blind study both the participants and the researchers delivering and assessing treatment are kept unaware, controlling a wider set of biases.

    Which bias does double blinding most directly address?

    Double blinding chiefly controls performance and detection bias — the distortions introduced when participants or assessors alter behaviour or judgement because they know who received the intervention.

    Can a study still be valid if blinding is impossible?

    Yes. Where the intervention cannot be masked, blinding the outcome assessors and using objective endpoints preserve much of the protection, provided the limitation is reported honestly.

    How does blinding relate to randomisation?

    Randomisation balances groups at the outset and counters selection bias; blinding keeps them comparable afterwards by preventing knowledge of assignment from influencing treatment and measurement. They work together.

  • The Pharmaceutical Research and Development Pipeline Explained

    The pharmaceutical research and development pipeline is the structured, multi-stage process through which a candidate medicine progresses from initial discovery to an approved, monitored product. It moves through discovery, preclinical evaluation, phased clinical trials, regulatory review and post-market surveillance, with rigorous standards and accumulating data governing the decision to advance or halt at every step.

    The pipeline is best understood not as a guaranteed route but as a sequence of evidentiary gates. Most candidates that enter discovery never reach patients, and attrition is a designed feature rather than a failure: each stage is intended to identify safety or efficacy problems before more participants and resources are committed.

    Discovery and target identification

    Discovery begins with understanding the biology of a disease and identifying a molecular target — a protein, receptor or pathway whose modulation might produce a therapeutic effect. Researchers then screen large libraries of compounds to find “hits”, refine them into “leads” through medicinal chemistry, and characterise how they behave. This stage is heavily data-driven, relying on reproducible assays and well-documented methods. Clear, standardised reporting of these early findings — the kind of metadata discipline catalogued in the CASRAI dictionary — makes downstream reuse and verification possible.

    Preclinical evaluation

    Before any human exposure, candidates undergo preclinical testing in laboratory and animal models to assess pharmacology, toxicology and how the body absorbs, distributes, metabolises and excretes the compound. The aim is to establish a plausible safety margin and a rationale for a starting human dose. Good Laboratory Practice frameworks govern how these studies are conducted and recorded, and the resulting data package supports the application a sponsor must file before clinical testing may begin.

    Clinical phases

    Human testing proceeds through sequential phases, each answering a different question. The structure and oversight of these phases are explored in detail in our guide to clinical trial phases I to IV.

    Phase | Primary question | Typical focus
    Phase I | Is it safe in humans? | Safety, tolerability, dose range, pharmacokinetics
    Phase II | Does it work, and at what dose? | Preliminary efficacy, dose-finding, further safety
    Phase III | Is it effective and safe at scale? | Confirmatory efficacy versus a comparator, broader safety
    Phase IV | How does it perform in routine use? | Post-approval surveillance, rare effects, long-term outcomes

    Confirmatory phases typically rely on the randomised controlled trial design, which provides the most robust basis for causal claims about benefit and harm. Comparator arms frequently use a placebo, whose role and ethics are discussed in our piece on the placebo effect.

    Regulatory review

    Once clinical data are assembled, the sponsor submits a marketing authorisation application to a regulator, which assesses quality, safety and efficacy. Reviewers scrutinise the trial designs, statistical analyses and manufacturing controls. Approval is conditional on the totality of evidence supporting a favourable benefit–risk balance for a defined indication and population — not on any single trial in isolation. Regulators may also attach post-approval commitments, such as further studies or restricted use, where uncertainty remains. Because the review weighs an entire evidence package, the credibility of each underlying study — its design, its pre-specified outcomes and its transparent reporting — directly shapes the decision. Weak or selectively reported evidence at any earlier stage can undermine an otherwise promising candidate at this gate.

    Post-market surveillance

    Approval is not the end of the pipeline. Pharmacovigilance systems continuously monitor real-world safety once a medicine reaches large, diverse populations, capturing rare or delayed effects that controlled trials cannot detect. Findings can lead to label changes, restrictions or, occasionally, withdrawal. This continuous-evidence model reflects the wider research lifecycle, in which knowledge is provisional and updated as data accumulate. Phase IV studies, spontaneous adverse-event reporting and large observational databases all feed this stage, and the same standards of structured data and transparent methods that governed the clinical phases continue to determine how reliably real-world signals can be interpreted and acted upon.

    Why the pipeline takes so long and stays uncertain

    The pipeline is long and uncertain because biology is difficult to predict and because each stage deliberately raises the evidential bar. A candidate that looks promising in a laboratory model may behave differently in human physiology; one that is safe at a low dose may show toxicity at a therapeutic one; and an effect seen in a small early study may evaporate in a large confirmatory trial. Rather than treat these surprises as setbacks, the staged design exists precisely to surface them while exposure is still limited.

    Uncertainty also compounds across stages. Because so few discovery candidates survive to preclinical work, and only a fraction of those entering human testing reach approval, the pipeline is best modelled as a funnel of conditional probabilities. This is why sponsors run programmes as portfolios rather than single bets, and why transparent reporting of failures — not only successes — is so valuable to the wider field. We avoid quoting specific cost or duration figures here precisely because they vary enormously by therapeutic area and are frequently misreported; the structural point stands regardless of the numbers.
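
    The funnel of conditional probabilities can be sketched as a running product. The transition probabilities below are placeholders chosen purely to show the arithmetic; they are not estimates of real attrition rates, which vary by therapeutic area.

```python
from math import prod

# Made-up transition probabilities, for illustration only (not real attrition data)
stage_success = [
    ("discovery to preclinical", 0.05),
    ("preclinical to Phase I", 0.50),
    ("Phase I to Phase II", 0.60),
    ("Phase II to Phase III", 0.30),
    ("Phase III to approval", 0.60),
]


def cumulative_survival(stages):
    """Chance of clearing each gate in turn, given entry at the first stage."""
    running = 1.0
    result = []
    for name, p in stages:
        running *= p  # each gate is conditional on passing the previous ones
        result.append((name, running))
    return result


for name, cumulative in cumulative_survival(stage_success):
    print(f"{name:26s} cumulative probability {cumulative:.4f}")

overall = prod(p for _, p in stage_success)
print(f"overall chance from discovery to approval: {overall:.4f}")
```

    Whatever the true numbers, multiplying several probabilities below one shrinks the overall chance quickly, which is the structural reason programmes are run as portfolios.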

    Data and standards at each stage

    Each stage produces a distinct evidence package, and the value of that package depends on how well it is structured and documented. Discovery generates assay data and compound characterisation; preclinical work produces toxicology and pharmacokinetic datasets; clinical phases yield protocol-bound outcome data; and post-market surveillance accumulates real-world safety signals. When these are recorded with consistent terminology, persistent identifiers and version-controlled protocols, evidence can be audited, pooled across studies and reused — strengthening regulatory decisions and reproducibility alike.

    Stage | Key data produced | Standards focus
    Discovery | Screening hits, assay and structure data | Reproducible methods, metadata
    Preclinical | Toxicology, pharmacokinetics | Good Laboratory Practice records
    Clinical | Protocol-bound outcome data | Preregistration, trial governance
    Post-market | Real-world safety signals | Pharmacovigilance reporting

    The role of standards and data discipline

    At every stage, structured data, consistent terminology and transparent methods determine whether results can be trusted and reused. Persistent identifiers, version-controlled protocols and clear documentation of contributions allow regulators, replicators and downstream researchers to interpret findings correctly. The discipline of specifying analyses in advance — explored in our guide to preregistration and Registered Reports — is increasingly applied to clinical work to keep confirmatory claims honest. Guidance for documenting one’s own contributions to such work is set out in our resources for authors.

    Frequently asked questions

    Why do so few candidates reach approval?

    Attrition is intentional. Each gate is designed to stop unsafe or ineffective candidates early, before larger populations are exposed. The high failure rate reflects the difficulty of predicting human biology from early models, not a flaw in the process.

    What distinguishes preclinical from clinical work?

    Preclinical work occurs in laboratory and animal models to establish a plausible safety case, whereas clinical work involves human participants under regulatory oversight and ethical review.

    Does approval mean a medicine is fully understood?

    No. Approval reflects a favourable benefit–risk judgement on the evidence available at the time. Post-market surveillance continues to refine that picture, sometimes for many years.

    How do standards improve the pipeline?

    Consistent terminology, structured metadata and transparent protocols make data verifiable, reusable and comparable across studies, strengthening regulatory decisions and reproducibility throughout the lifecycle.

  • Randomised Controlled Trials: The Gold Standard Explained

    A randomised controlled trial (RCT) is an experimental study in which participants are allocated to an intervention group or a comparison group purely by chance, so that the only systematic difference between groups is the treatment under test. By combining randomisation, a control or comparison arm and, where possible, blinding, the RCT isolates the effect of an intervention from confounding factors, making it the methodological gold standard for answering causal questions.

    The core insight is simple but powerful: if allocation is genuinely random and groups are large enough, known and unknown confounders tend, on average, to be distributed evenly across arms. Any difference in outcome larger than chance would explain can then be attributed to the intervention rather than to pre-existing differences between participants.

    Randomisation

    Randomisation is the process of assigning participants to groups by chance — for example, by computer-generated sequence. Its purpose is to balance characteristics such as age, severity and unmeasured risk factors across arms, removing selection bias from the comparison. Without it, sicker or healthier participants might cluster in one group, distorting the result.
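
    A computer-generated sequence can be as simple as the sketch below. It uses permuted blocks, one common scheme chosen here for illustration because it keeps arm sizes close throughout recruitment; the text above does not prescribe a particular method, and the function name and parameters are hypothetical.

```python
import random


def block_randomise(n_participants, block_size=4,
                    arms=("intervention", "control"), seed=None):
    """Generate an allocation sequence using permuted blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # seeded so the sequence can be audited later
    sequence = []
    while len(sequence) < n_participants:
        block = list(arms) * (block_size // len(arms))  # equal arms per block
        rng.shuffle(block)                              # random order within block
        sequence.extend(block)
    return sequence[:n_participants]


allocation = block_randomise(12, block_size=4, seed=2024)
for participant_id, arm in enumerate(allocation, start=1):
    print(f"participant {participant_id:2d}: {arm}")
```

    In a real trial the sequence would be generated and held away from the people enrolling participants, so that upcoming assignments cannot be foreseen.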

    Allocation concealment

    Allocation concealment ensures that those enrolling participants cannot foresee or influence which group a person will join. It is distinct from blinding: concealment protects the randomisation process at the point of assignment, whereas blinding operates after assignment. Poor concealment is one of the most consistently demonstrated sources of exaggerated treatment effects.

    Control and comparison

    A control or comparison arm provides the counterfactual — what would have happened without the intervention. Comparators may be a placebo, standard care or an active alternative. The placebo arm in particular controls for expectation effects, a topic explored in our article on the placebo and placebo effect.

    Blinding

    Blinding (or masking) prevents participants, clinicians or assessors from knowing group assignment, reducing conscious and unconscious bias. The mechanics of single, double and triple blinding, and the specific biases they address, are set out in our companion guide to double-blind studies and bias control.

    Intention-to-treat analysis

    Intention-to-treat (ITT) analysis evaluates participants in the groups to which they were randomised, regardless of whether they completed the assigned treatment. This preserves the benefits of randomisation and gives a realistic estimate of effectiveness in practice, where adherence is imperfect. The contrasting per-protocol analysis, which includes only those who followed the protocol, can reintroduce bias and is usually treated as secondary.
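
    The difference between the two analyses is easiest to see on a toy data set. The figures below are invented for illustration, and the `Participant` record and `effect` function are hypothetical names. In this example everyone who dropped out is recorded as not recovered, which is itself a simplifying assumption about missing outcomes.

```python
from dataclasses import dataclass

@dataclass
class Participant:
    assigned: str    # arm allocated at randomisation
    completed: bool  # followed the protocol to the end
    recovered: bool  # outcome of interest

def recovery_rate(people):
    return sum(p.recovered for p in people) / len(people)

def effect(participants, per_protocol=False):
    """Difference in recovery rate, treatment minus control."""
    if per_protocol:
        # Per-protocol: drop everyone who did not complete the assigned regimen.
        participants = [p for p in participants if p.completed]
    treated = [p for p in participants if p.assigned == "treatment"]
    control = [p for p in participants if p.assigned == "control"]
    return recovery_rate(treated) - recovery_rate(control)

# Invented data: ten participants per arm.
trial = (
    [Participant("treatment", True, True)] * 6
    + [Participant("treatment", True, False)] * 1
    + [Participant("treatment", False, False)] * 3
    + [Participant("control", True, True)] * 4
    + [Participant("control", True, False)] * 5
    + [Participant("control", False, False)] * 1
)

itt = effect(trial)                     # analysed as randomised
pp = effect(trial, per_protocol=True)   # completers only
print(round(itt, 3), round(pp, 3))
```

    Here more participants dropped out of the treatment arm, so excluding them makes the treatment look better than it would for the people actually offered it. The per-protocol estimate is larger than the ITT estimate for that reason alone.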

    Why the RCT is the gold standard

    For causal questions about whether an intervention works, the RCT’s design controls the main threats to validity in one structure. It sits at the heart of the confirmatory stage of drug development, as described in our overview of the pharmaceutical R&D pipeline, and underpins evidence-based decision-making across the research lifecycle.

    Anatomy of a well-conducted RCT

    A robust trial weaves these elements together rather than relying on any single one. The table below summarises the core components and the threat each addresses.

    Component              | Purpose                  | Threat addressed
    Randomisation          | Balance groups by chance | Confounding, selection bias
    Allocation concealment | Hide upcoming assignment | Manipulation of enrolment
    Control arm            | Provide a counterfactual | Mistaking change for effect
    Blinding               | Conceal group membership | Performance and detection bias
    Intention-to-treat     | Analyse as randomised    | Attrition and post-hoc selection

    Power, sample size and pre-specification

    Randomisation balances groups reliably only when the sample is large enough, and a trial can detect a real effect reliably only when it has enough participants. Trials therefore specify a target sample size derived from the smallest difference worth detecting, together with the expected variability of the outcome, the significance level and the desired power. Too small a study may miss a real effect or produce an unstable estimate; an adequately powered one gives the result interpretive weight. Equally important is pre-specifying the primary outcome and analysis plan before the data are seen, so that a single confirmatory test is fixed in advance rather than chosen afterwards. This connects directly to the practice of preregistration and Registered Reports, which protects the trial’s confirmatory status from later analytic flexibility.
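
    To make the link between the smallest difference worth detecting and the target sample size concrete, the sketch below applies the standard normal-approximation formula for comparing two means with equal-sized arms. The function name, the default 5% two-sided significance level and the 80% power are assumptions for the example; a real trial would justify its own values and often use more refined methods.

```python
import math
from statistics import NormalDist

def sample_size_per_arm(min_difference, sd, alpha=0.05, power=0.80):
    """Participants per arm for a two-sided comparison of two means (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    n = 2 * ((z_alpha + z_beta) * sd / min_difference) ** 2
    return math.ceil(n)

# Hypothetical outcome with standard deviation 10 points.
print(sample_size_per_arm(min_difference=5, sd=10))    # 63
print(sample_size_per_arm(min_difference=2.5, sd=10))  # 252
```

    Halving the difference worth detecting roughly quadruples the required sample, which is why the choice of that difference deserves as much scrutiny as the analysis itself.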

    Where the RCT sits in the evidence hierarchy

    A single trial, however well conducted, is rarely the final word. Findings gain strength when they are replicated and when multiple RCTs are combined in systematic reviews and meta-analyses, which sit above the individual trial in the evidence hierarchy. Conversely, a well-designed observational study can sometimes be more informative than a flawed or under-powered RCT. The design is a powerful tool, not an automatic guarantee of truth, and its value depends on execution and transparent reporting.

    Internal versus external validity

    Two distinct questions decide whether a trial is useful. Internal validity asks whether the result is true for the participants studied — whether the design genuinely isolated the intervention’s effect from bias and confounding. External validity asks whether that result generalises to other people, settings and conditions. The RCT excels at the first: randomisation, concealment, control and blinding are precisely the tools that secure internal validity. It is weaker on the second, because the controlled conditions and selected participants that protect internal validity can make a trial less representative of routine practice. Strong evidence requires attention to both, and the two sometimes pull in opposite directions.

    Pragmatic versus explanatory trials

    This tension has produced two broad trial styles. Explanatory trials test whether an intervention can work under ideal, tightly controlled conditions — maximising internal validity and answering questions of efficacy. Pragmatic trials test whether it does work in everyday clinical settings with broader participants and fewer restrictions — favouring external validity and answering questions of effectiveness. Neither is superior in the abstract; the right choice depends on the question being asked. A regulator confirming a causal effect may want an explanatory design, while a health system deciding whether to adopt a treatment may learn more from a pragmatic one. Reporting which style a trial used helps readers interpret how far its findings should travel.

    Limits of the design

    RCTs are not universally applicable. They can be expensive, may exclude populations seen in routine practice, and are sometimes unethical or impractical: you cannot randomise people to harmful exposures. Tightly controlled conditions can also limit generalisability, opening a gap between efficacy (does it work in the trial?) and effectiveness (does it work in the real world?). Transparent reporting and good documentation, as encouraged in our guidance for authors, help readers judge how far a trial’s findings extend.

    Frequently asked questions

    What makes randomisation so important?

    Randomisation distributes both known and unknown confounders evenly across groups, so that observed differences in outcome can be attributed to the intervention rather than to pre-existing imbalances.

    How is allocation concealment different from blinding?

    Allocation concealment hides the upcoming assignment from those enrolling participants, protecting the randomisation itself. Blinding hides group membership after assignment to prevent biased behaviour and assessment.

    Why use intention-to-treat analysis?

    Analysing participants in their assigned groups preserves randomisation and gives a pragmatic estimate of effect under realistic adherence, avoiding bias introduced by excluding non-completers.

    When is an RCT not appropriate?

    When randomisation would be unethical, impractical or impossible — for example for harmful exposures or rare conditions — observational designs may be the only feasible option, accepting their greater vulnerability to confounding.

  • Linking grants, projects and outputs across the research lifecycle

    Research has a natural arc. A funder makes an award; the award supports a project with a team, activities and a timeline; the project produces outputs — papers, datasets, software, sometimes patents or policy contributions; and those outputs go on to have an impact. It is a single connected story. Yet in most institutions it is recorded as several disconnected ones: the grant lives in a funder’s system and a finance system, the project in a current-research-information system, and the outputs in repositories, journals and ORCID profiles — none of which reliably know about the others. The result is a fractured record in which it is surprisingly hard to answer a basic question: what did this grant actually produce? This article looks at the identifiers and standards that stitch the lifecycle back together, drawing on the research lifecycle domain of the CASRAI Dictionary.

    The fragmentation problem

    The fragmentation is not anyone’s fault; it is a consequence of these stages being managed by different organisations with different systems built at different times. But the cost is real. Funders want to demonstrate what their investment yielded and cannot easily do so. Institutions struggle to report the full output of a project. Researchers re-enter the same grant and output details into system after system. And the connective tissue — this paper came from that project, which was funded by this grant — exists only as prose in acknowledgements, if at all. The fix is to give each entity in the chain a persistent identifier and to record the links between them in a machine-readable way, so the connections survive across systems rather than living in someone’s memory.

    Grant identifiers

    The first link is the grant identifier: a persistent, resolvable identifier for a funding award. When a grant has its own identifier, every output it supports can reference it unambiguously, and the funder can in principle gather everything connected to the award without manual reconciliation. Persistent grant identifiers replace the fragile practice of citing awards by free-text grant numbers — which vary in format, get mistyped and cannot be resolved — with something a machine can follow. This is the foundation on which funder reporting and impact tracking depend, because without a stable handle on the award there is nothing reliable for outputs to point back to.

    RAiD: identifying the project itself

    The middle of the lifecycle — the project — has historically been the least well identified, and this is where RAiD (Research Activity Identifier) comes in. RAiD is an ISO-standard persistent identifier for a research project or activity. Where a grant identifier names the funding and a DOI names an output, RAiD names the project — the connecting entity that ties together the people, the institutions, the funding, and the outputs over the project’s lifetime. A RAiD record can hold these relationships in one place: who is involved (by ORCID), which organisations (by ROR), which awards fund it, and which outputs it generates. That makes the project a first-class, citable node in the graph rather than an implicit gap between funding and publication. Our explainer on what RAiD is covers how the identifier works and how it is used in practice; the essential point is that it fills the long-standing hole in the middle of the lifecycle.

    Crossref grant linking

    On the output side, Crossref grant linking provides the mechanism for connecting published outputs back to the funding that supported them. Funders can register their grants with Crossref, giving each a DOI, and publishers can include funding information in the metadata they deposit when registering an article. The two are then linked: a grant record can surface the outputs that acknowledge it, and an output record carries a resolvable reference to its funding. This turns the funding acknowledgement — previously unstructured prose that no machine could use — into a structured, navigable link. Combined with grant identifiers and RAiD, it completes the chain from funder to output and back again.

    What a fully linked lifecycle enables

    When these identifiers are in place and connected, several things become possible that were previously laborious or impossible. A funder can assemble a complete, automatically maintained picture of what an award produced — papers, datasets, software — for reporting and evaluation. An institution can report the full output of a project without manual collation. A researcher can have their outputs automatically associated with the right grant and project rather than re-keying details. And anyone can traverse the chain in either direction: from a paper to its project, funding and team, or from a grant to everything it enabled. The fractured story becomes a connected one, assembled by following identifiers rather than by hand.
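
    Traversing the chain in either direction amounts to following links in a small graph. The sketch below is a minimal illustration: the identifiers, relation names and in-memory list are all hypothetical stand-ins for what would in practice be resolvable persistent identifiers held in registry metadata.

```python
from collections import defaultdict

# Hypothetical identifiers and relation names, for illustration only.
LINKS = [
    ("grant:G-001", "funds", "project:P-001"),
    ("project:P-001", "produced", "output:paper-1"),
    ("project:P-001", "produced", "output:dataset-1"),
    ("person:researcher-1", "contributes_to", "project:P-001"),
]

def build_index(links):
    """Index the links so they can be followed forwards or backwards."""
    forward, backward = defaultdict(set), defaultdict(set)
    for source, _relation, target in links:
        forward[source].add(target)
        backward[target].add(source)
    return forward, backward

def reachable(start, index):
    """All nodes reachable from start by following links in one direction."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for neighbour in index.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen

forward, backward = build_index(LINKS)

# From a grant to everything it enabled.
outputs_of_grant = sorted(n for n in reachable("grant:G-001", forward) if n.startswith("output:"))
# From a paper back to its funding.
funding_of_paper = sorted(n for n in reachable("output:paper-1", backward) if n.startswith("grant:"))
print(outputs_of_grant, funding_of_paper)
```

    Once the links are recorded, "what did this grant produce?" becomes a lookup rather than a manual reconciliation exercise.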

    Credit and consistency across the chain

    A connected lifecycle also makes credit more complete. When outputs are linked to the project and people who produced them, the contributions recorded through the CRediT taxonomy become part of a larger picture — not just who did what on a single paper, but how that work fits into a funded project and a research career. The set of contribution roles is described in our overview of the CRediT roles. For all of this to function, a grant, a project and an output must be described consistently wherever they appear, so that a link made in one system is understood in another. That consistency is what the CASRAI Dictionary provides: the shared vocabulary that lets the whole lifecycle — funding, activity and outputs — be recorded once and connected everywhere.

  • Electronic lab notebooks and structured record-keeping across the research lifecycle

    When we picture the scholarly record, we tend to think of its end products: the published paper, the deposited dataset, the citation. But each of those is the visible tip of a much larger body of work — the active, day-to-day conduct of research, where experiments are designed and run, samples processed, instruments operated and observations recorded. For generations this working phase was captured, if at all, in the paper laboratory notebook: a bound book on a bench, legible only to its author, locked in a drawer, and disconnected from everything else. An immense amount of crucial information about how research is actually done remained invisible to the wider record. The electronic lab notebook and the structured record-keeping practices around it are changing that. This article looks at how, drawing on the research-lifecycle domain of the CASRAI Dictionary.

    What an electronic lab notebook is

    An electronic lab notebook, or ELN, is software that replaces the paper notebook as the place where researchers record their day-to-day work: experiments, protocols, observations, results and the reasoning behind decisions. At its simplest, an ELN offers obvious practical advantages over paper — it is searchable, backed up, shareable, and resistant to the coffee stains and illegible handwriting that have plagued laboratory science forever. But its deeper significance is that it makes the working record digital and therefore connectable. A paper notebook is an island; an electronic one can be linked to the protocols it follows, the instruments and samples it references, the data files it produces and the people who did the work. The ELN is the point at which the active phase of research enters the connected world that the rest of the record already inhabits.

    Capturing the active phase as connected metadata

    This is the central idea: the ELN lets the active phase of research be captured as connected metadata rather than disappearing into a drawer. When work is recorded electronically and linked properly, a rich web of relationships can be built around it — this experiment used that protocol; it was performed by these people on that instrument; it consumed these samples and produced these data files; it belongs to this project and contributes to that publication. The working phase stops being a black box between the start of a project and its outputs, and becomes a documented, navigable part of the record. This matters for reproducibility, because others can see exactly how a result was produced; for collaboration, because the record is shared rather than siloed; and for integrity, because the chain from question to result is visible rather than reconstructed after the fact.

    FAIR principles for the working record

    The same FAIR principles — Findable, Accessible, Interoperable, Reusable — that govern published data apply, with equal force, to the records created during the active phase. An ELN that captures structured, well-described records makes the working record findable and reusable in a way a paper notebook never could be. The principle is that good data management should not begin at the moment of deposit, when a project ends, but should run through the entire lifecycle, starting at the bench. If records are created in a structured, connected form from the outset, preparing data for deposit becomes a matter of harvesting and tidying what already exists, rather than reconstructing it. Good record-keeping during the active phase is, in this sense, the foundation of good data management overall.

    Provenance: the PROV standard

    A particular strength of structured electronic record-keeping is its capacity to capture provenance, the record of how something came to be: what data was used, what processes acted on it, what agents (people, software, instruments) were involved, and in what order. Provenance is the basis of trust in a result, because it lets others trace exactly how that result was produced and verify each step. The W3C PROV standard provides a formal, machine-readable model for expressing provenance. It describes the entities, activities and agents in a process and the relationships between them, so that the chain of how a result was produced can be recorded consistently and understood across systems. An ELN that captures provenance in line with such a standard turns the working record into something far more powerful than a diary: a verifiable account of how knowledge was made.
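
    A provenance chain of this kind can be sketched as a list of simple statements. The example below borrows three relation names from the PROV data model (`wasGeneratedBy`, `used`, `wasAssociatedWith`) but is not a PROV serialisation; the identifiers and the `trace` helper are hypothetical, and a real system would use a dedicated library or one of the standard PROV formats.

```python
# Statements as (subject, relation, object). All identifiers are hypothetical.
STATEMENTS = [
    ("figure-3", "wasGeneratedBy", "analysis-run"),          # entity <- activity
    ("analysis-run", "used", "cleaned-data"),                # activity -> entity
    ("analysis-run", "wasAssociatedWith", "analyst"),        # activity -> agent
    ("cleaned-data", "wasGeneratedBy", "cleaning-step"),
    ("cleaning-step", "used", "raw-readings"),
    ("cleaning-step", "wasAssociatedWith", "cleaning-script"),
    ("raw-readings", "wasGeneratedBy", "measurement"),
    ("measurement", "wasAssociatedWith", "plate-reader"),
]

def trace(entity, statements):
    """Walk back from an entity through the activities and inputs that produced it."""
    steps, frontier = [], [entity]
    while frontier:
        current = frontier.pop(0)
        for subject, relation, obj in statements:
            if subject != current:
                continue
            steps.append((subject, relation, obj))
            # Keep following the chain through activities and their inputs;
            # agents are recorded but end the walk.
            if relation in ("wasGeneratedBy", "used"):
                frontier.append(obj)
    return steps

lineage = trace("figure-3", STATEMENTS)
agents = sorted(obj for _, relation, obj in lineage if relation == "wasAssociatedWith")
for step in lineage:
    print(step)
```

    Starting from a single figure, the walk recovers every input, process and responsible agent back to the original measurement.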

    Identifying the work itself: activity identifiers

    If the active phase is to be connected to the rest of the research landscape, the work itself needs to be identifiable. Persistent identifiers have transformed how we refer to outputs and people; the same logic is now being applied to research activities. RAiD (the Research Activity Identifier) is a persistent identifier for research projects and activities, providing a stable handle for the work itself — not just its eventual outputs. With an activity identifier, the records captured in an ELN, the data produced, the people involved and the resulting publications can all be tied to a single, persistent identity for the project. The whole arc of a piece of research — from the work as it happens to the products it yields — can then be traced as a connected whole rather than a set of disconnected fragments.

    A consistent vocabulary across the lifecycle

    For records created at the bench to connect with everything downstream — data repositories, CRIS platforms, publications — the elements they contain must mean the same thing everywhere: what a protocol, a sample, an instrument or an activity denotes. That consistency is what the CASRAI Dictionary provides: a shared vocabulary so that the record captured in an electronic lab notebook is understood identically wherever it flows. And because the work recorded there — investigation, data curation, methodology — is genuine contribution, it can be described in the same framework used for every output, the CRediT taxonomy and its full set of contribution roles. The electronic lab notebook brings the most hands-on phase of research into the connected record; structured record-keeping, provenance and activity identifiers let that phase take its rightful place in the story of how knowledge is made.

  • Open science across the research lifecycle: from preregistration to preservation

    Open science is often encountered as a set of separate practices: a journal’s open-access policy, a funder’s data-sharing requirement, a colleague’s preregistered study. Treated piecemeal, each can feel like an isolated obligation. But open science is most powerful, and most coherent, when its practices are understood as connected stages in the arc of a single project — when openness runs through the whole research lifecycle rather than appearing only at the end. Seen this way, preregistration, open data, open access and preservation are not unrelated requirements but successive expressions of one principle: that research is more trustworthy, more useful and more cumulative when it is conducted in the open. This article traces openness across the lifecycle through the research lifecycle domain of the CASRAI Dictionary.

    A global framework: the UNESCO Recommendation

    That open science is a connected whole rather than a collection of separate practices is reflected in the most significant international statement on the subject: the UNESCO Recommendation on Open Science, adopted by UNESCO member states in November 2021 as a shared global framework. It treats open science not as a single act of sharing but as an integrated set of practices and values (open access to publications, open research data, open-source software, open infrastructures, open engagement with society) underpinned by transparency, equity and inclusion. Its scope is the point: it frames openness as a culture spanning the entire research process, not a box ticked at publication, and provides a common reference for understanding open science as a coherent lifecycle.

    The beginning: preregistration

    Openness can begin before any data are collected. Preregistration is the practice of specifying a study’s hypotheses, methods and analysis plan in advance, and recording that plan in a way that cannot be quietly changed later. Its purpose is to strengthen the integrity of research by making clear what was planned before the results were known, which guards against practices such as reshaping hypotheses to fit the data or selectively reporting only what worked. A particularly developed form is the registered report, in which a study’s plan is peer-reviewed and accepted in principle before the results exist, so that publication depends on the quality of the question and method rather than on whether the findings turn out to be striking. Preregistration makes the research process transparent from the outset and sets the foundation for everything that follows.

    The middle: open and FAIR data

    As a project generates data, openness shifts to how that data is managed and shared. The widely adopted FAIR principles hold that data should be Findable, Accessible, Interoperable and Reusable — properties that let data be discovered, understood and built upon by others rather than locked away or lost. Making data FAIR, and as open as is responsible, transforms it from a private by-product of one study into a lasting resource for the community. This stage connects backwards and forwards: data shared openly allows the results derived from it to be checked, and it allows the data itself to feed new research it was never collected for. Openness in the middle of the lifecycle is what gives a project value beyond its own conclusions.

    The output: open access

    When findings are written up, openness turns to open access — making the resulting publications freely available to read rather than locked behind paywalls. It can be achieved through different routes, including publishing in open-access venues and depositing accepted manuscripts in repositories, but the principle is constant: research that anyone can read can be verified, used and built upon by the widest possible audience. Open access is the most visible face of open science, but within the lifecycle it is one stage among several. A paper that is open but rests on hidden data and an undisclosed plan is less open than it appears; open access is most meaningful when it sits atop preregistration and open data.

    The long term: preservation

    The lifecycle does not end at publication, because outputs that are open today are worthless tomorrow if they vanish. Digital preservation is the work of ensuring that data, publications, software and other outputs remain accessible, intact and usable over the long term, against the threats of format obsolescence, link rot, storage failure and institutional change. There is little point making research open if it cannot be found or opened a decade later. Trusted repositories, persistent identifiers and active preservation practices are what keep the open record open over time, closing the loop so that the openness built earlier actually endures.
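
    One routine way to check that preserved files remain intact is fixity checking: record a checksum for each file at deposit and recompute it later. The sketch below assumes SHA-256 and an in-memory manifest; the function names and the sample file are hypothetical, and a real repository would store the manifest separately and run such checks on a schedule.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(folder):
    """Map each file's relative path to its checksum."""
    folder = Path(folder)
    return {
        str(p.relative_to(folder)): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }

def changed_files(folder, manifest):
    """Names that are missing or whose checksum no longer matches the manifest."""
    current = build_manifest(folder)
    return sorted(name for name, digest in manifest.items() if current.get(name) != digest)

with tempfile.TemporaryDirectory() as archive:
    data_file = Path(archive) / "results.csv"
    data_file.write_text("sample,value\na,1\n")
    manifest = build_manifest(archive)                # recorded at deposit
    intact = changed_files(archive, manifest)         # nothing has changed yet
    data_file.write_text("sample,value\na,2\n")       # simulate silent corruption
    damaged = changed_files(archive, manifest)

print(intact, damaged)
```

    A checksum mismatch does not repair anything by itself, but it tells the repository which files to restore from another copy.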

    The lifecycle as a connected whole

    The deeper point is that these stages reinforce one another. Preregistration makes the eventual open data and open publication more meaningful, because the plan they can be checked against is on record. Open data makes the open publication verifiable. Preservation makes all of it durable. Openness at one stage is weakened when another stage is missing: open access over secret data, or open data with no preservation, falls short of the whole. This is why open science is best understood as a lifecycle rather than a checklist: its value is cumulative and connected, exactly the vision the UNESCO Recommendation articulates. Our learning resources explore each practice in more depth.

    A consistent vocabulary across the lifecycle

    For openness to connect across stages and systems, the information describing each stage must mean the same thing everywhere — the status of a preregistration, the access conditions of data, the licence on a publication, the preservation state of an output. That consistency is what the CASRAI Dictionary provides: a shared vocabulary so that the open-science attributes of a project are understood identically across the systems that record them. And because contribution runs through every stage, the work done at each can be described in the same shared framework — the CRediT taxonomy and its full set of contribution roles. Open science is not a single act but a way of working across the whole life of a project; its power lies in the connection of its parts.