Category: Perspectives

Opinion, argument, and field-shaping commentary on research-administration standards.

  • Research Data Management Policy: €10.2bn Case

    A research data management policy that treats FAIR compliance as a line-item cost, rather than a reuse and reputation asset, rests on the wrong accounting model. PwC estimated in a 2018 study for the European Commission that the absence of FAIR (Findable, Accessible, Interoperable, Reusable) research data costs the European economy at least €10.2 billion a year, largely through duplicated data collection and wasted researcher time. That figure is the strongest evidence available that under-investment in research data management (RDM) infrastructure is a false economy, not a saving.

    A research data management policy is an institutional document setting out the responsibilities of researchers and the institution for planning, storing, securing, sharing and preserving research data across its lifecycle. Most UK universities — Southampton, Birmingham, Manchester, Edinburgh and others — already publish one. The argument here is narrower and more contentious: most are drafted, funded and governed as compliance paperwork, when the evidence says they should be funded as reuse and reputation infrastructure.

    Why RDM policy gets treated as a cost centre

    Institutional budgets typically classify research data management as overhead: storage costs, repository subscriptions, a data steward’s salary, training time. Each appears as a debit with no offsetting credit line, because savings from avoided duplication and faster reuse accrue diffusely, across future researchers and grants, not to the budget holder who paid for the infrastructure.

    This accounting mismatch is compounded by how the data management plan (DMP) requirement is handled in practice. Most funders now mandate one, but research offices frequently treat it as a box-ticking exercise completed at proposal stage and never revisited, rather than a live operational document. That framing under-serves the researcher, who gets no practical reuse benefit, and the institution, which under-recovers the true cost of good RDM from grants that would pay for it.

    UK Research and Innovation (UKRI) explicitly states that costs associated with research data management — storage, curation, repository deposit — are eligible for recovery under its funding. Institutions treating RDM as unfunded overhead are frequently leaving recoverable grant money unclaimed rather than avoiding a cost.

    What the evidence actually says about FAIR and avoided cost

    The FAIR data principles were formalised in 2016 by Wilkinson et al. in Scientific Data as a guide for making digital assets Findable, Accessible, Interoperable and Reusable by both humans and machines. FAIR data is not a compliance checkbox; it is a design standard for making data usable by someone who was not present when it was collected.

    The clearest attributed cost estimate comes from PwC’s 2018 cost-benefit analysis for the European Commission, which put the annual cost of non-FAIR research data to the European economy at €10.2 billion, driven by researcher time lost searching for data, recreation of data that already exists, and lost interdisciplinary reuse. A separate, frequently cited illustration is the Minnesota Coronary Experiment, a diet trial led by University of Minnesota researchers between 1968 and 1973, whose original data sat in storage for decades before being recovered and reanalysed. It is a reminder that data loss is a recurring, avoidable event when retention and documentation are afterthoughts.

    Three mechanisms explain where the savings actually come from:

    • Avoided duplication. Findable, well-described data lets a second researcher build on an existing dataset instead of re-running a costly collection exercise.
    • Faster reuse cycles. Interoperable data in standard formats with persistent identifiers can be integrated into new analyses without reformatting or re-negotiating access.
    • Preserved institutional memory. Deposit in a certified repository protects data against two of the most common loss vectors: staff turnover and undocumented local storage.

    None of this shows up as a saving in a university’s annual accounts, which is precisely why RDM investment is chronically under-prioritised relative to its documented return.

    How funder compliance requirements are changing the calculus

    Funder mandates are steadily converting FAIR data from voluntary good practice into a hard compliance gate, which changes the institutional risk calculus even for leaders unconvinced by the reuse argument. UKRI’s Common Principles on Research Data, and the underlying Concordat on Open Research Data, require a data management plan for funded research and state that data should be made openly available with as few restrictions as necessary. Horizon Europe applies comparable requirements, and cOAlition S’s Plan S pushes the same expectations into journal-level open-access policy.

    A comparison of how three major funders frame the requirement illustrates the convergence:

    Funder / framework | Core RDM requirement | FAIR reference
    UKRI | Data management plan for funded research; RDM costs eligible for recovery | Endorses FAIR via the Concordat on Open Research Data
    Horizon Europe | DMP required within six months of project start, updated across lifecycle | “As open as possible, as closed as necessary,” explicitly FAIR-aligned
    cOAlition S (Plan S) | Underlying data should accompany open-access publications | References FAIR principles for supporting data

    Institutions that fund RDM only to the minimum needed for a single grant’s DMP template are exposed twice: to duplicated administrative cost when infrastructure is rebuilt project by project, and to compliance risk as funders move toward auditing DMP adherence rather than merely requiring its submission.

    The case for investing in data stewardship, not just policy text

    A policy document alone does not create FAIR data. That requires people: a data steward function — a dedicated role, a network of disciplinary data champions, or a research data service embedded in the library — able to advise researchers on repository choice, metadata standards and licensing at the point where those decisions are actually made, not after the fact.

    Institutions that fund this role tend to route researchers toward standards-based infrastructure rather than ad hoc local storage: a research data repository registered in re3data.org, ideally holding CoreTrustSeal certification, with persistent identifiers (DOIs) and standard metadata attached to every deposit. This is the practical, unglamorous mechanism by which the costs behind the €10.2 billion estimate above are actually avoided: not through a policy PDF, but through a person and a repository that make FAIR operational.
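    To make "FAIR operational" concrete, the kind of pre-deposit check a data steward might run can be sketched in a few lines. This is an illustration only, not an official FAIR assessment: the record structure, its field names, the DOI and the schema name used in the example are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class DepositRecord:
    """Hypothetical description of one dataset deposit (field names are illustrative)."""
    title: str
    doi: str = ""                      # persistent identifier; placeholder values only
    licence: str = ""                  # e.g. "CC-BY-4.0"
    metadata_standard: str = ""        # name of the metadata schema used, if any
    repository_in_re3data: bool = False


def missing_fair_basics(record: DepositRecord) -> list[str]:
    """List the basics a steward would chase before accepting the deposit."""
    gaps = []
    if not record.doi.startswith("10."):
        gaps.append("persistent identifier (DOI)")
    if not record.metadata_standard:
        gaps.append("standard metadata schema")
    if not record.licence:
        gaps.append("licence")
    if not record.repository_in_re3data:
        gaps.append("repository registered in re3data.org")
    return gaps


# A draft deposit with only a title and licence, and a completed one.
draft = DepositRecord(title="Survey wave 3", licence="CC-BY-4.0")
complete = DepositRecord(
    title="Survey wave 3",
    doi="10.1234/example",             # made-up DOI for illustration
    licence="CC-BY-4.0",
    metadata_standard="DataCite",
    repository_in_re3data=True,
)

print(missing_fair_basics(draft))
print(missing_fair_basics(complete))
```

    The point of the sketch is the timing rather than the code: the gaps are cheapest to close when the deposit is made, which is where a steward sits.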

    CASRAI’s relevance here is provenance and interoperability, not ownership. CASRAI originated the CRediT contributor role taxonomy in 2014, and the standard is now stewarded by NISO as ANSI/NISO Z39.104-2022. CRediT makes the same underlying argument in a different domain: standardising who-did-what reduces duplicated verification effort just as standardising data description reduces duplicated data collection. Institutions weighing their research administration infrastructure should treat RDM policy, contributor attribution and open data reuse as one reputational and efficiency system, not separate obligations.

    Answer-first Q&A

    What is a research data management policy?

    A research data management policy is an institutional document defining responsibilities for planning, storing, securing, sharing, and archiving research data across its lifecycle. UK universities including Edinburgh and Manchester publish theirs publicly, typically requiring a data management plan at proposal stage and deposit in an approved repository after project completion.

    What are the FAIR data principles?

    The FAIR data principles — Findable, Accessible, Interoperable, Reusable — were published by Wilkinson et al. in 2016 in Scientific Data as guidance for making digital research assets usable by both humans and machines, through persistent identifiers, standard metadata, and clear licensing.

    Do UK and EU funders require a data management plan?

    Yes. UKRI requires a data management plan for funded research and treats RDM costs as eligible for recovery, while Horizon Europe requires a DMP within six months of project start under its “as open as possible, as closed as necessary” principle.

    How much does poor research data management actually cost?

    PwC’s 2018 analysis for the European Commission put the annual cost of non-FAIR research data to the European economy at €10.2 billion, driven primarily by duplicated data collection and researcher time lost searching for data that already exists elsewhere.

    Implications for institutional leaders

    The practical implication is a reframing exercise, not necessarily a large new budget line. Research offices should cost RDM infrastructure — repositories, data steward time, metadata training — against the funder-eligible recovery already available through DMP-linked grants, rather than absorbing it as unfunded overhead. Leaders reviewing their research data management policy should ask whether it funds a data steward with real authority over repository choice and metadata quality, or whether it is a document that satisfies a compliance checklist and stops there.

    The evidence (a €10.2 billion EU-wide cost estimate, UKRI’s funding eligibility for RDM costs, and Horizon Europe’s escalating DMP requirements) points in one direction: institutions that keep treating FAIR compliance as a cost centre are choosing to keep paying the duplication tax FAIR data was designed to eliminate.

  • Limitations of Bibliometrics: DORA and CoARA

    Bibliometrics — the statistical analysis of publication and citation data — cannot reliably stand in for research quality on its own: field-specific citation practices, author self-citation, and outright metric gaming all distort single-number scores such as the h-index or Journal Impact Factor. This is the documented evidentiary basis for DORA and CoARA’s push to replace single-score evaluation with qualitative, multi-indicator assessment.

    Bibliometrics is the quantitative study of academic literature — citation counts, publication volume, and derived indices — used as a proxy for scholarly influence. The proxy breaks down whenever a single number is asked to carry the full weight of a quality judgement, which is precisely what large-scale hiring, promotion, tenure, and funding panels have done for decades.

    What is bibliometrics, and why does one score fall short?

    Bibliometric indicators — citation counts, the h-index, the Journal Impact Factor (JIF), and derived composite scores — were built for large-scale, aggregate comparisons, not for judging an individual scholar’s contribution. Bergstrom, West and Wiseman’s 2008 analysis in the Journal of Neuroscience put it plainly: quantitative metrics are poor choices for assessing an individual’s research output compared with the “gold standard” of reading the work and consulting domain experts.

    A single score compresses conflicting dimensions of scholarly value — novelty, rigour, reproducibility, societal reach — into one figure. That compression, not citation data itself, is the structural weakness reform movements target.

    How does field bias distort bibliometric comparisons?

    Citation practices vary sharply by discipline, so raw citation counts cannot be compared across fields. Mathematics and the humanities publish and cite far less frequently than biomedicine, and books and conference proceedings — the dominant outputs in many humanities and computing sub-fields — are tracked inconsistently, or not at all, by Web of Science and Scopus.

    Coverage gaps compound the bias. Indexing databases differ in subject breadth, subject depth, geographic coverage, language coverage, and how far back citation histories extend, so researchers publishing outside the Anglophone, journal-dominant core of a database are systematically under-counted. Belter’s 2015 review (archived in PubMed Central) also notes that citation-based indicators require roughly two to three years after publication before they stabilise enough to be considered reliable, a lag that penalises early-career researchers and recent work by design.

    Why does self-citation inflate bibliometric scores?

    Self-citation — an author citing their own prior work — is a normal and often legitimate part of building on a research programme. It becomes a distortion when it is used strategically to inflate an individual’s citation count or a journal’s Impact Factor beyond what independent uptake of the work would justify.

    Clarivate’s Journal Citation Reports has, in past cycles, suppressed the calculated Impact Factor of titles found to display anomalous citation behaviour, including excessive journal self-citation and coordinated “citation stacking” arrangements between journals — a documented, database-level enforcement action against exactly this failure mode. At author level, unusually concentrated self-citation rates are one of the diagnostic flags bibliometricians use when auditing whether a headline citation figure reflects genuine external uptake or engineered inflation.

    Does field-weighted citation impact solve the problem?

    Field-weighted citation impact (FWCI) is a normalised metric — used in tools such as Scopus/SciVal — that adjusts a publication’s citation count against the average for its subject field, publication year, and document type, so that a score of 1.0 represents “as expected” performance for that context. It is a genuine improvement on raw citation counts because it corrects for the field-bias problem described above.
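    The normalisation idea can be shown with a small worked sketch. It assumes the simplest possible reading of the definition above (actual citations divided by the mean citations of works sharing the same field, year and document type) and uses invented records; the production calculation in commercial tools involves further details, such as citation windows and papers assigned to several fields, that are ignored here.

```python
from collections import defaultdict
from statistics import mean

# Invented records for illustration: (field, year, document type, citations).
papers = [
    ("mathematics", 2020, "article", 4),
    ("mathematics", 2020, "article", 2),
    ("mathematics", 2020, "article", 6),
    ("biomedicine", 2020, "article", 40),
    ("biomedicine", 2020, "article", 20),
    ("biomedicine", 2020, "article", 60),
]


def expected_citations(records):
    """Mean citations for each (field, year, document type) group."""
    groups = defaultdict(list)
    for field, year, doc_type, cites in records:
        groups[(field, year, doc_type)].append(cites)
    return {key: mean(values) for key, values in groups.items()}


def field_weighted_impact(record, baselines):
    """Actual citations divided by the expected citations for the same context."""
    field, year, doc_type, cites = record
    baseline = baselines[(field, year, doc_type)]
    if baseline == 0:
        raise ValueError("no citations in the comparison group; ratio undefined")
    return cites / baseline


baselines = expected_citations(papers)
maths_score = field_weighted_impact(papers[2], baselines)   # 6 citations vs. a mean of 4
biomed_score = field_weighted_impact(papers[3], baselines)  # 40 citations vs. a mean of 40
print(maths_score, biomed_score)
```

    In this toy data the mathematics paper with 6 citations scores above the biomedicine paper with 40, because each is compared only with its own field.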

    FWCI does not, however, correct for self-citation gaming or database coverage gaps, and it remains a single number: it shows how a paper performed against a benchmark, not whether the research was rigorous or original. Reform frameworks treat field normalisation as a refinement of bibliometrics, not a licence to keep using any single indicator as a proxy for quality.

    What evidence underlies DORA and CoARA’s reform case?

    The San Francisco Declaration on Research Assessment (DORA), launched in 2012, explicitly recommends against using the Journal Impact Factor as a surrogate measure of the quality of individual research articles, and calls on institutions to assess research on its own merits using a range of qualitative and quantitative indicators. The Coalition for Advancing Research Assessment (CoARA), formed in 2022, builds on DORA’s diagnosis: its signatories commit to basing assessment primarily on qualitative, peer-reviewed judgement, supported by responsible — not exclusive — use of quantitative indicators, and to abandoning inappropriate use of journal- and publication-based metrics such as the JIF and h-index.

    Both build directly on the failure modes above: field bias, self-citation gaming, database coverage gaps, and the two-to-three-year reliability lag are the documented evidence, not abstract principle, behind the push for reform.

    Initiative | Launched | Core commitment
    DORA (San Francisco Declaration on Research Assessment) | 2012 | Stop using the Journal Impact Factor as a proxy for individual article or researcher quality
    Leiden Manifesto | 2015 (Hicks et al., Nature 520, 429–431) | Ten principles for the responsible, transparent use of quantitative indicators alongside expert judgement
    CoARA (Coalition for Advancing Research Assessment) | 2022 | Base assessment primarily on qualitative peer review; abandon inappropriate JIF/h-index use in hiring, promotion and funding decisions

    Answer-first questions on bibliometric limitations

    What are the main limitations of bibliometrics in research assessment?

    The main limitations are field bias (citation norms differ by discipline), database coverage gaps (books, non-English and non-journal outputs are under-tracked), self-citation inflation, and a two-to-three-year lag before citation counts stabilise. Together these mean a single score cannot substitute for expert, qualitative judgement of research quality.

    Why is the h-index considered a poor measure of individual research quality?

    The h-index rewards volume and career length over insight, cannot distinguish a lead author from one member of a large collaborative team, and does not account for field-specific citation norms. Bergstrom, West and Wiseman (2008) concluded that reading the work and consulting experts remains the more reliable standard for individual evaluation.
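    A short sketch shows why the index tracks volume more than impact. It uses the standard definition (the largest h such that h papers each have at least h citations) and three invented citation profiles.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Invented profiles for illustration only.
steady_output = [5, 5, 5, 5, 5]      # five modestly cited papers
one_landmark = [900, 5, 5, 5, 5]     # the same record plus one landmark paper
short_career = [900, 850]            # two landmark papers, nothing else yet

print(h_index(steady_output), h_index(one_landmark), h_index(short_career))
```

    The landmark paper changes nothing in the second profile, and the third profile scores lowest of all despite holding the two most cited papers, simply because it has fewer of them.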

    What is the difference between DORA and CoARA?

    DORA (2012) is a signable declaration focused primarily on eliminating Journal Impact Factor misuse. CoARA (2022) is a membership coalition of funders, universities and academies that goes further, committing signatories to a broader, peer-review-centred reform agenda across hiring, promotion, and institutional evaluation, with periodic reporting on progress.

    What is a self-citation rate and why does it matter?

    A self-citation rate is the proportion of an author’s or journal’s total citations that come from their own prior work rather than independent external uptake. Bibliometricians and citation-database auditors (including Clarivate’s Journal Citation Reports process) use unusually high self-citation rates as a flag for possible metric gaming rather than genuine scholarly influence.

    What should research administrators do differently?

    For research administrators and institutional leaders, the practical implication is not to discard citation data but to stop letting any single figure carry a hiring, promotion, or funding decision unsupervised. That means:

    • Pairing field-normalised indicators such as FWCI with narrative, qualitative peer assessment, as CoARA commitments require.
    • Auditing self-citation and journal self-citation patterns before citing a headline figure in a case file.
    • Recognising a fuller range of outputs — datasets, software, policy influence — rather than journal articles alone.
    • Crediting individual contributions on multi-author papers explicitly, rather than inferring credit from author position or aggregate citation share.

    On that last point, standardised contributor-role taxonomies address a related gap directly. CASRAI originated the CRediT contributor role taxonomy in 2014; the standard is now stewarded by NISO as ANSI/NISO Z39.104-2022, and it lets institutions record which named contributor performed which specific role on a paper — conceptualisation, data curation, writing — rather than relying on citation share or author-list position as a proxy for who did what.
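    As a rough illustration of what recording roles looks like in practice, the sketch below validates a contributor statement against the fourteen CRediT role names and answers a "who did what" query. The contributor names and helper functions are hypothetical, and the role spellings should be checked against the NISO standard before reuse.

```python
# The fourteen CRediT role names (US spelling, en dash in the two writing roles).
CREDIT_ROLES = frozenset({
    "Conceptualization", "Data curation", "Formal analysis", "Funding acquisition",
    "Investigation", "Methodology", "Project administration", "Resources",
    "Software", "Supervision", "Validation", "Visualization",
    "Writing – original draft", "Writing – review & editing",
})


def unknown_roles(contributions):
    """Return {contributor: [role names not in the taxonomy]}."""
    problems = {}
    for person, roles in contributions.items():
        bad = sorted(set(roles) - CREDIT_ROLES)
        if bad:
            problems[person] = bad
    return problems


def contributors_for(role, contributions):
    """Names of everyone recorded against a given role."""
    return sorted(person for person, roles in contributions.items() if role in roles)


# Hypothetical contributor statement for one paper.
paper = {
    "A. Researcher": ["Conceptualization", "Writing – original draft"],
    "B. Steward": ["Data curation", "Validation"],
    "C. Analyst": ["Formal analysis", "Number crunching"],  # second entry is not a CRediT role
}

print(unknown_roles(paper))
print(contributors_for("Data curation", paper))
```

    Because the roles are a closed vocabulary, a case file can state who curated the data directly instead of inferring it from author order.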

    Where bibliometric reform goes next

    The evidentiary case against single-number bibliometric scores is now well established: field bias, database coverage gaps, self-citation gaming, and a multi-year reliability lag are documented, auditable failure modes, not theoretical objections. DORA and CoARA translate that evidence into institutional commitments, and field-normalised metrics such as FWCI narrow — without eliminating — the field-bias problem.

    The direction of travel for funders, universities and academies is toward layered assessment: responsibly used quantitative indicators, transparent contributor-role attribution, and peer judgement at the centre, rather than any one score standing alone.

  • Is Self-Citation Ethical in Responsible Metrics?

    Is self-citation ethical? Self-citation is ethical when an author cites their own prior work because it is genuinely relevant to a new argument, method, or dataset; it becomes unethical only when the primary motive shifts to inflating citation counts, h-index, or a journal’s impact factor. Neither DORA nor CoARA — the two dominant responsible-metrics frameworks — sets a self-citation rule, leaving this judgement almost entirely to editors, reviewers, and individual conscience.

    Self-citation is the practice of an author referencing their own previously published work within a new publication, most commonly to establish methodological continuity, avoid self-plagiarism, or trace the development of a research programme over time.

    What counts as self-citation, and why do researchers do it?

    Self-citation occurs whenever an author lists their own prior publication in a new paper’s reference list. It is neither rare nor inherently suspect: most research is cumulative, and a study that builds on a researcher’s earlier method, dataset, or theoretical framework has good reason to cite that earlier work directly.

    • Establishing methodological continuity with a previously validated technique or instrument
    • Avoiding self-plagiarism by properly attributing earlier text, data, or ideas
    • Tracing the trajectory of a multi-paper research programme for the reader
    • Providing background the author is best placed to cite because they generated the original finding

    The Committee on Publication Ethics (COPE) has noted that failing to cite one’s own directly relevant prior work can itself mislead readers into thinking a study is more novel than it is — so the ethical failure mode runs in both directions, not only toward over-citation.

    How much self-citation is considered excessive?

    There is no single, universally agreed self-citation rate ceiling. A 2023 analysis archived in PubMed Central (PMC) concluded that a self-citation rate around 20 percent is conservatively tolerable for individual researchers, with rates substantially above that treated as inappropriate. The same paper stresses, however, that discipline size and publication norms shift what counts as normal.

    COPE’s own November 2017 forum discussion, “Self-Citation: Where’s the Line?”, found no consensus figure among editors. Some journals cap the absolute number of self-citations (for example, no more than five), others use a percentage-of-total-references ceiling, and many rely on case-by-case editorial judgement rather than a fixed rule. COPE’s broader position on handling citation manipulation asks journals to set their own thresholds and educate authors, rather than prescribing one number for the whole of scholarly publishing.

    A 2025 analysis in the Journal of Academic Ethics (Springer) reinforces the intent-based test over a rate-based one, concluding that “ethical reviewers should avoid unnecessary self-citation” while allowing that citing one’s own work is acceptable “if directly relevant” — the same relevance-over-frequency logic COPE applies.

    Why don’t DORA and CoARA address self-citation directly?

    The San Francisco Declaration on Research Assessment (DORA, 2012) is aimed squarely at eliminating the use of the journal impact factor as a proxy for individual researcher quality in hiring, funding, and promotion decisions. It says nothing about how many times an author may cite themselves within a paper’s reference list — that is a citation-practice question, not a journal-metric question, and sits outside DORA’s original scope.

    The Coalition for Advancing Research Assessment (CoARA), formed in 2022, commits signatory institutions to move away from inappropriate use of quantitative indicators and toward qualitative, narrative-based evaluation. This is the closest thing academia has to a responsible-metrics consensus position, yet CoARA’s Agreement likewise does not name self-citation as a distinct risk category — it addresses metric misuse at the institutional and assessment level, not individual reference-list behaviour.

    The result is a genuine governance gap. Self-citation sits between two policy domains — publication ethics (COPE’s territory) and research assessment reform (DORA and CoARA’s territory) — without either treating it as a first-class concern. Editors are left applying inconsistent journal-level rules, while institutional assessment reformers focus almost entirely on how metrics are used rather than on what feeds into them.

    Disclosure norms vs blanket caps: the better governance model

    A blanket percentage cap on self-citation is easy to state but poorly matched to how research actually varies. Small or emerging subfields with few active authors, first-in-series methodology papers, and long-running research programmes will all show naturally higher self-citation rates than a large, well-established field — penalising a rate rather than the intent behind it risks punishing legitimate continuity while doing little to stop a determined metric-gamer, who can simply keep self-citations just under whatever line is drawn.

    A more workable precedent already exists in bibliometrics. The standardized citation-metrics database maintained by Ioannidis, Boyack, and Baas — used to identify the world’s most-cited scientists across disciplines — reports each author’s composite citation score both with and without self-citations included, alongside their raw self-citation percentage. It does not impose a cutoff; it makes the number visible and lets the reader judge. That is a disclosure model, not a cap.
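    The disclosure idea is simple enough to sketch. The function below is a simplified illustration, not the database's own method: it treats a citation as a self-citation whenever the assessed author appears on the citing paper, matches authors by plain name, and uses invented data. The database's actual definition and composite score are more involved.

```python
def citation_disclosure(author, citing_author_lists):
    """Report citations with and without self-citations, plus the rate.

    citing_author_lists holds one list of author names per citing paper.
    """
    total = len(citing_author_lists)
    self_cites = sum(1 for authors in citing_author_lists if author in authors)
    rate = self_cites / total if total else 0.0
    return {
        "total": total,
        "excluding_self": total - self_cites,
        "self_citation_rate": rate,
    }


# Invented example: ten citing papers, two of them co-written by the assessed author.
citing_papers = [["N. Example", "P. Other"], ["N. Example"]] + [["Q. Independent"]] * 8
report = citation_disclosure("N. Example", citing_papers)
print(report)
```

    Nothing is filtered out or capped; the reader sees all three numbers and decides what they mean in context.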

    Framework | Year | Position on self-citation | Governance model
    COPE | 2017/ongoing | Case-by-case editorial judgement; no fixed universal threshold | Journal-level policy, editorial discretion
    DORA | 2012 | Not addressed; targets impact-factor misuse in assessment | Institutional assessment reform
    CoARA | 2022 | Not addressed; targets inappropriate metric use generally | Institutional assessment reform
    Ioannidis/Boyack/Baas database | 2019, updated annually | Reports self-citation rate transparently alongside adjusted score | Disclosure, no cap
    Individual journal caps | Varies | Fixed number or percentage limit on self-citations | Blunt rule, inconsistently applied

    Applying that same logic to individual authors and grant applicants is straightforward: require a disclosed self-citation rate alongside any citation-based metric submitted for hiring, promotion, or funding decisions, rather than an arbitrary cap that cannot distinguish a legitimate methods lineage from deliberate metric inflation.

    Answer-first Q&A on self-citation ethics

    Is self-citation unethical?

    Self-citation is not inherently unethical. It becomes ethically problematic only when it is used to inflate citation metrics rather than to serve genuine scholarly continuity — what COPE treats as a form of citation manipulation. Relevance to the argument, not frequency, is the ethical test that matters.

    Is it okay to cite yourself in a research paper?

    Yes. Citing your own prior work is standard practice when it establishes methodological continuity, avoids self-plagiarism, or shows how a study builds on earlier findings. Problems arise only when self-citations serve no argumentative purpose beyond raising an author’s h-index or a journal’s impact factor.

    Is self-citation illegal?

    No. Self-citation is a matter of publication ethics, not law. Excessive or irrelevant self-citation can breach a journal’s editorial policy or COPE’s citation-manipulation guidance, potentially triggering a correction or editorial inquiry, but it is not in itself a legal offence.

    Implications for journals, funders, and institutions

    Journals can adopt the disclosure model directly: require authors to report a manuscript’s self-citation percentage at submission, alongside a one-line rationale where the rate is unusually high, rather than enforcing an arbitrary cap during peer review.
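    At manuscript level, the reported figure would be the share of the reference list written by the manuscript's own authors. A minimal sketch follows, with several assumptions stated plainly: authors are matched by surname string (a real workflow would more likely use identifiers such as ORCID iDs), the data is invented, and the 20 percent trigger for requesting a rationale is a hypothetical default that echoes the tolerance figure quoted earlier, not a recommended limit.

```python
def manuscript_self_citation(manuscript_authors, references):
    """Count references sharing an author with the manuscript; return (count, percent)."""
    authors = set(manuscript_authors)
    self_refs = [ref for ref in references if authors & set(ref)]
    share = 100 * len(self_refs) / len(references) if references else 0.0
    return len(self_refs), round(share, 1)


def needs_rationale(share_percent, flag_above=20.0):
    """True when the journal would ask for a one-line rationale (not a rejection)."""
    return share_percent > flag_above


# Invented submission: eight references, two of them by the manuscript's own authors.
authors = ["Okafor", "Lindqvist"]
reference_list = [["Okafor", "Reyes"], ["Lindqvist"]] + [["Unrelated"]] * 6

count, share = manuscript_self_citation(authors, reference_list)
print(count, share, needs_rationale(share))
```

    The flag prompts an explanation and nothing more, which keeps the editorial judgement about intent where the disclosure model places it.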

    CoARA signatories reforming promotion and funding criteria are well placed to extend their existing move toward narrative CVs by asking applicants to disclose self-citation-adjusted metrics alongside any citation count submitted for assessment — consistent with CoARA’s broader commitment to context over raw indicators.

    DORA signatories evaluating individual researchers already commit to judging research on its own merits rather than by journal-level proxies; adding a self-citation disclosure line to that practice would close a gap the original 2012 declaration was never designed to cover.

    Conclusion: toward transparent, not punitive, norms

    Self-citation is not a solved problem in responsible metrics guidance — it is an unaddressed one. DORA targets journal-level metric misuse; CoARA targets institutional assessment culture; COPE offers editorial case law without a universal rule. None of the three treats individual self-citation disclosure as a named requirement.

    The fix does not need a new blanket percentage cap, which would misfire across disciplines of different sizes and publication norms. It needs a disclosure norm: report the self-citation rate, report the rationale where it is high, and let editors, funders, and hiring committees judge intent with that information in hand — the same logic that already underpins the field’s most credible standardized citation databases.

  • OpenAlex: The Case for Open Research Metrics

    OpenAlex is a free, CC0-licensed index of more than 319 million scholarly works, together with the authors and institutions behind them, built by the non-profit OurResearch to replace the discontinued Microsoft Academic Graph. For institutions weighing research-metrics platforms, its open data answers a question closed commercial indices cannot: who can audit the numbers behind an assessment decision.

    OpenAlex is an openly accessible bibliographic catalogue of scientific papers, authors and institutions, named after the Library of Alexandria. One design choice, publishing the full dataset under a public-domain licence rather than behind a subscription wall, is what separates it structurally from Elsevier’s Scopus and Clarivate’s Web of Science, and is why it has become a reference point in debates about research-assessment transparency.

    What Is OpenAlex?

    OpenAlex launched in January 2022, built by OurResearch (a US non-profit operating as Impactstory, Inc.) as a successor to the Microsoft Academic Graph, which Microsoft stopped updating on 31 December 2021. The project inherited MAG’s dataset and rebuilt it as an open, queryable graph of works, authors, institutions, funders, and topics.

    Two design decisions define the platform. First, the entire dataset is released under a Creative Commons Zero (CC0) licence, meaning any institution, developer, or researcher can download, redistribute, and build on it without permission or cost. Second, OpenAlex has formally adopted the Principles of Open Scholarly Infrastructure (POSI), a governance commitment covering sustainability, community control, and data portability.

    The scale is now substantial. OpenAlex’s own catalogue reports more than 319 million scholarly works, and its API handled roughly 115 million queries a month in 2024, according to figures cited in the platform’s Wikipedia entry. It draws source data from Crossref, ORCID, DOAJ, and Unpaywall rather than from a closed editorial pipeline.

    How Does OpenAlex Compare with Scopus and Web of Science?

    The practical difference is not just price — it is what each platform lets an institution verify. Scopus and Web of Science apply proprietary, selective journal-inclusion criteria and sell access to the resulting index. OpenAlex indexes broadly by default and publishes the inclusion logic as open code, which means an institution can inspect exactly why a work is or is not counted.

    Dimension | OpenAlex | Scopus (Elsevier) | Web of Science (Clarivate)
    Governance | Non-profit (OurResearch), POSI-aligned | Commercial publisher | Commercial data company
    Data licence | CC0, fully open, bulk download | Proprietary, licensed access only | Proprietary, licensed access only
    Core journal metric | No proprietary journal metric | CiteScore (four-year citation average) | Journal Impact Factor
    Coverage approach | Broad, automated aggregation, strong Diamond OA and non-English coverage | Curated, selective journal list | Curated, selective journal list
    Cost to institutions | Free API; optional paid support tier | Subscription | Subscription

    CiteScore, Scopus’s flagship journal metric, averages the citations a journal’s documents receive over a four-year window — a useful signal, but one calculated entirely inside a closed system that institutions cannot independently reproduce. OpenAlex does not publish an equivalent branded journal score; instead it exposes the underlying citation and work-level data so that any bibliometrician can calculate their own indicator and show their working.
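    As a rough illustration of what calculating an indicator and showing the working can look like, the sketch below computes a CiteScore-style ratio from work-level records. The record layout (`year`, `citations_by_year`) and the function name `four_year_score` are hypothetical, chosen for this example rather than taken from OpenAlex or Scopus, and the real CiteScore methodology restricts document types in ways this sketch ignores.

```python
from dataclasses import dataclass, field


@dataclass
class Work:
    """One journal document; a hypothetical record layout for illustration."""
    year: int                                               # publication year
    citations_by_year: dict = field(default_factory=dict)   # {citing year: count}


def four_year_score(works, end_year):
    """Citations received inside a four-year window by documents published
    in that same window, divided by the number of those documents."""
    window = range(end_year - 3, end_year + 1)
    docs = [w for w in works if w.year in window]
    if not docs:
        return None  # no documents in the window, so no score
    citations = sum(
        count
        for w in docs
        for year, count in w.citations_by_year.items()
        if year in window
    )
    return citations / len(docs)


# Made-up numbers, purely to show the arithmetic.
works = [
    Work(2020, {2020: 1, 2021: 4, 2023: 2}),
    Work(2022, {2022: 0, 2023: 3}),
    Work(2018, {2021: 9}),  # published before the window, so excluded
]
score = four_year_score(works, end_year=2023)
print(score)  # (1 + 4 + 2 + 0 + 3) citations / 2 documents
```

    Because every input is a plain record, anyone holding the same open data can re-run the calculation and arrive at the same figure.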

    Coverage differences matter for equity as much as accuracy. A 2024 study cited in OpenAlex’s Wikipedia entry found the platform indexes more than 12,500 Diamond Open Access journal titles, including over 60% of Diamond OA journals absent from both Web of Science and Scopus — a direct consequence of not gating inclusion behind a commercial selection committee.

    Why Does Open Metrics Infrastructure Serve DORA’s Transparency Principle?

    The San Francisco Declaration on Research Assessment (DORA), drafted in December 2012 and published in 2013, asks funders, institutions, and publishers to stop substituting journal-based proxies for direct evaluation of research and to be explicit about the criteria used in funding, hiring, and promotion decisions. That explicitness requirement is where the platform choice stops being neutral.

    A closed index can tell an institution that a number was calculated a certain way, but it cannot let that institution independently verify how, because the underlying citation graph is licensed, not published. An open metadata layer removes that opacity: the same dataset an institution cites in a tenure file or a funding report can be downloaded, re-run, and checked by anyone, including the researcher being assessed.

    Adoption evidence has followed the argument. Leiden University announced in September 2023 that it would produce an open-source edition of its CWTS Leiden Ranking using OpenAlex data from 2024 onward. Sorbonne University announced in December 2023 that it was withdrawing its Web of Science subscription in favour of OpenAlex. In 2024, France’s Ministry of Higher Education and Research pledged financial support to the project, describing it as “crucial open science infrastructure,” and the Arcadia Fund awarded OurResearch a $7.5 million grant explicitly to build OpenAlex into a sustainable alternative to commercial citation indices.

    • Leiden University: open-source CWTS Leiden Ranking edition built on OpenAlex data (from 2024)
    • Sorbonne University: Web of Science subscription withdrawn in favour of OpenAlex (December 2023)
    • French Ministry of Higher Education and Research: financial commitment to OpenAlex as open science infrastructure (2024)
    • Arcadia Fund: $7.5 million grant to OurResearch for OpenAlex sustainability (March 2024)

    None of this means closed indices lack value; their curated selection and mature analytics tooling still suit some high-stakes evaluations. But where the explicit requirement is transparency rather than convenience, an auditable, CC0-licensed data layer meets DORA’s stated principle more directly than a licensed black box.

    Common Questions About OpenAlex

    What is OpenAlex used for?

    Universities, funders, and publishers use OpenAlex to track publication output, measure open-access status, benchmark institutional performance, and feed alternative rankings such as the open-source CWTS Leiden Ranking. Its free API also underpins third-party dashboards, systematic-review tools, and research-information systems that need citation and affiliation data without a subscription fee.

    Is OpenAlex legit?

    Yes. OpenAlex is maintained by OurResearch, a non-profit with a multi-year record of building open scholarly infrastructure, and it has formally adopted the Principles of Open Scholarly Infrastructure (POSI). Its data and methodology are openly licensed and auditable, and the platform is already cited in peer-reviewed scientometrics research and described in a 2022 arXiv preprint by its founders.

    Is OpenAlex free?

    Yes. The full dataset is released under a Creative Commons Zero (CC0) public-domain licence, and the REST API can be queried without a subscription, unlike Scopus or Web of Science. A polite-pool rate limit applies to unauthenticated use, and OurResearch offers an optional paid support tier for high-volume institutional queries.

    Who owns OpenAlex?

    OpenAlex is created and maintained by OurResearch, a US-based non-profit operating as Impactstory, Inc., not by a commercial publisher. Governance sits with a mission-driven organisation rather than a shareholder-owned company — the structural distinction that underpins its CC0 licensing and its appeal to institutions pursuing publisher-independent, DORA-aligned metrics.

    What Should Institutional Leaders Do Next?

    Platform choice is now a governance decision, not just a procurement one. An institution that cites OpenAlex data in a promotion case, a funding report, or an open-access dashboard is making a transparency claim as well as a metrics claim, and that claim should be tested before it is relied upon.

    • Map which existing assessment workflows (tenure, funding reports, rankings submissions) rely on a metric an evaluator cannot independently reproduce.
    • Pilot OpenAlex alongside — not instead of — existing subscriptions, comparing coverage gaps directly against Scopus or Web of Science outputs for your own institutional corpus.
    • Document data provenance explicitly in assessment criteria, consistent with DORA’s requirement for stated, auditable methodology.
    • Track POSI-aligned infrastructure commitments (OpenAlex, Crossref, ORCID, ROR) as the durable layer beneath any commercial tool an institution also chooses to license.
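    The pilot step in the list above can start as a simple set comparison. The sketch below assumes two DOI lists have already been exported for the same institutional corpus, one from OpenAlex and one from a subscription index. The function names and the DOIs are invented for illustration, and real exports usually need more cleaning than this.

```python
def normalise_doi(doi):
    """Lower-case a DOI and strip common prefixes so both exports compare cleanly."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def coverage_gaps(open_dois, subscription_dois):
    """Split two DOI lists into shared records and records unique to each source."""
    a = {normalise_doi(d) for d in open_dois}
    b = {normalise_doi(d) for d in subscription_dois}
    return {
        "both": a & b,
        "open_only": a - b,
        "subscription_only": b - a,
    }


# Invented DOIs, purely for illustration.
openalex_export = ["https://doi.org/10.1234/ABC", "10.1234/def", "doi:10.1234/ghi"]
subscription_export = ["10.1234/abc", "10.1234/xyz"]

gaps = coverage_gaps(openalex_export, subscription_export)
for name, dois in gaps.items():
    print(name, sorted(dois))
```

    The two "only" sets are the records worth inspecting by hand, since they show where each source would change an institution's counts.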

    Open, non-proprietary metadata will not replace every function a commercial index performs today. But as funders and assessment reformers keep pressing for auditable evidence over proprietary scores, institutions that already understand — and can reproduce — their own metrics will be the ones best placed to defend them.

  • REF 2029 Academic Employment Uncertainty for Contract Staff

    REF 2029’s decision to weaken output portability, then partially reverse that decision after a three-month pause in late 2025, has left fixed-term and early-career researchers unsure whether published work will count towards their next job. A five-year portability window now applies to long-form outputs such as monographs, but shorter outputs generally stay with the institution that supported them — a “half in, half out” compromise that unions and sector commentators say still leaves contract staff exposed.

    The Research Excellence Framework (REF) 2029 is the third national assessment run under the REF name, following REF 2014 and REF 2021. It assesses the quality of research produced by UK higher education institutions and is run jointly by the UK’s four higher education funding bodies, with submissions due in autumn 2028.

    What changed in REF 2029’s portability rules?

    REF 2029’s original proposal effectively ended portability: outputs would stay attached to the institution that employed the researcher when the work was produced, even after that researcher left. This was designed to stop institutions “poaching” research-active staff shortly before a census date purely to inflate a submission.

    Following the 2025 pause, the REF team confirmed on 10 December 2025 that long-form outputs, principally monographs, would carry a five-year portability window, meaning a researcher can take these specific outputs to a new institution for up to five years from publication. Shorter outputs remain governed by the decoupling principle: an institution can still submit work by a researcher who has since departed. The REF team also reinstated a recommended maximum of five outputs per researcher, having earlier proposed removing per-researcher output limits.
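    The rule described above can be written as a short check. This is a minimal sketch under stated assumptions, not the REF team's own logic: it assumes the window runs from the publication date to the date of the move, and the names `is_portable` and `add_years` are hypothetical. The published guidance may define the reference dates differently.

```python
from datetime import date

PORTABILITY_YEARS = 5  # window for long-form outputs, as described in the text


def add_years(d, years):
    """Return the same calendar day `years` later, falling back to 28 February
    when the start date is 29 February and the target year is not a leap year."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def is_portable(long_form, published, moved):
    """True if a long-form output falls inside the five-year window at the move date.
    Standard (shorter) outputs are treated as decoupled, so never portable here."""
    if not long_form:
        return False
    return published <= moved <= add_years(published, PORTABILITY_YEARS)


# Illustrative dates only.
print(is_portable(True, date(2024, 3, 1), date(2027, 9, 1)))   # monograph, inside window
print(is_portable(True, date(2020, 1, 1), date(2026, 1, 1)))   # monograph, window expired
print(is_portable(False, date(2024, 3, 1), date(2025, 3, 1)))  # journal article
```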

    Element | REF 2021 | REF 2029 (post-pause, Dec 2025)
    Outputs / Contributions to Knowledge and Understanding weighting | 60% | 55%
    Impact / Engagement and Impact weighting | 25% | 25%
    Environment / Strategy, People and Research Environment weighting | 15% | 20%
    Output portability for long-form work | Full portability | 5-year window (monographs)
    Output portability for standard outputs | Full portability | Decoupled from researcher
    Recommended output cap per researcher | 5 (maximum) | 5 (reinstated)

    Why was REF 2029 paused in 2025 — and what resumed?

    Research England, on behalf of the four UK funding bodies, confirmed a three-month pause in REF 2029’s criteria-setting process from September 2025. UKRI stated the pause was needed “to take stock and ensure alignment with the UK government’s priorities and vision for higher education.” The pause followed sustained pushback over the proposed end to output portability: in Times Higher Education on 23 September 2025, scholars argued that breaking the link between researchers and their outputs “harms academic mobility and disciplinary excellence.”

    Criteria setting resumed on 10 December 2025, with the REF team publishing revised guidance covering portability, output caps, and the renamed Strategy, People and Research Environment (SPRE) element. The SPRE weighting is split 60% institution-level statement and 40% unit-level statement, replacing the single Environment statement used in REF 2021.
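    To make the split concrete, the sketch below works out what the 60/40 division means as a share of the overall REF 2029 profile, using the weightings stated in this article. The variable names are illustrative, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# REF 2029 element weightings as stated in this article.
WEIGHTS = {
    "outputs": Fraction(55, 100),
    "impact": Fraction(25, 100),
    "spre": Fraction(20, 100),
}

# Split of the SPRE element between the two statements.
SPRE_SPLIT = {
    "institution": Fraction(60, 100),
    "unit": Fraction(40, 100),
}


def spre_shares():
    """Share of the overall profile carried by each SPRE statement."""
    return {level: WEIGHTS["spre"] * part for level, part in SPRE_SPLIT.items()}


shares = spre_shares()
for level, share in shares.items():
    print(f"{level}-level statement: {float(share):.0%} of the overall profile")
```

    On these figures the institution-level statement carries 12 percentage points of the overall profile and the unit-level statement 8.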

    How does this affect fixed-term and early-career researchers?

    Fixed-term and early-career researchers are disproportionately exposed because their career currency is recent published output, and they move institutions more frequently than staff on permanent contracts. Under REF 2029’s decoupling principle, a researcher who leaves a post before the next census period may find that shorter-form outputs they produced stay credited to the former employer, with no guarantee the new institution can submit the same work.

    REF 2029 also introduces a substantive-link test for counting outputs from staff on part-time or non-standard contracts: at least 0.2 FTE and 12 months of contracted employment with a documented “research expectation.” Guidance does not require institutions to prove that time, funding or workload relief was actually provided to support that research — a gap flagged by commentators writing for Wonkhe in December 2025, who noted the term “research expectation” “remains vague” and can amount to “little more than a nominal clause.”
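    Expressed as a check, the test has three conditions. The sketch below is an illustration only: the `Contract` record and the function name are hypothetical, and the thresholds are the ones quoted in this article rather than a reading of the full guidance.

```python
from dataclasses import dataclass

MIN_FTE = 0.2      # minimum contracted fraction quoted in the text
MIN_MONTHS = 12    # minimum contracted duration quoted in the text


@dataclass
class Contract:
    """Hypothetical summary of one member of staff's contract."""
    fte: float
    months: int
    research_expectation_documented: bool


def meets_substantive_link(contract):
    """True when all three stated conditions hold.

    Note what is absent: nothing here checks whether time, funding or
    workload relief was actually provided, which mirrors the gap
    commentators have flagged."""
    return (
        contract.fte >= MIN_FTE
        and contract.months >= MIN_MONTHS
        and contract.research_expectation_documented
    )


print(meets_substantive_link(Contract(0.2, 12, True)))    # meets all three
print(meets_substantive_link(Contract(0.1, 24, True)))    # below the FTE floor
print(meets_substantive_link(Contract(0.5, 6, True)))     # contract too short
print(meets_substantive_link(Contract(0.5, 12, False)))   # no documented expectation
```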

    A peer-reviewed analysis published in Transactions of the Institute of British Geographers (Wiley) goes further, warning that “the growing uncertainties around REF 2029 are likely to foster a drift towards greater reliance on metrics and procedural compliance” — a dynamic that tends to disadvantage staff without secure, long-term contracts who cannot easily demonstrate institutional “sustainability.”

    • Researchers negotiating a move should ask prospective employers directly whether specific outputs will be portable under the five-year monograph window or excluded under decoupling.
    • Contract length and FTE now matter for REF eligibility, not just for pay and pension — a role below 0.2 FTE or under 12 months may not generate a countable “significant responsibility for research” record in HESA data.
    • The reinstated five-output cap changes competitive dynamics: fewer, stronger outputs may now carry more weight than a large back-catalogue built across several employers.

    What have unions and sector bodies said?

    The University and College Union (UCU), the main trade union representing UK academic and research staff, has for several REF cycles argued that assessment periods create incentives for institutions to concentrate research-active contracts around census dates rather than offer secure, long-term posts — a pattern that REF 2029’s shift to HESA-derived staff volumes was partly designed to reduce, since submissions no longer require institutions to name individual staff.

    Russell Group universities issued a joint statement on 10 December 2025 welcoming the resumption of criteria setting, while a Wonkhe analysis the same day observed that REF 2029 “talks about people again” through SPRE but that “early career labour is still hard to see” in how research contribution is actually counted. Research Professional News reported that the reinstated five-output cap and monograph portability window were the two concessions the sector had pushed hardest for during the pause.

    Common questions on REF 2029 employment uncertainty

    What are the key changes for REF 2029?

    REF 2029 rebalances weightings toward Strategy, People and Research Environment (up from 15% to 20%) and away from outputs (down from 60% to 55%), replaces individual staff submission with HESA-derived staff volume, reinstates a five-output cap per researcher, and grants five-year portability only to long-form outputs such as monographs.

    Why has REF 2029 been paused?

    Research England paused REF 2029’s criteria-setting process for three months from September 2025 following sector concern over the proposed end to output portability, stating the pause would allow the funding bodies to “take stock and ensure alignment” with government priorities before finalising guidance.

    Are REF outputs portable?

    Only partially. REF 2029 grants a five-year portability window to long-form outputs like monographs when a researcher changes institution. Shorter, standard outputs are generally decoupled — they can still be submitted by the former employer even after the researcher has left.

    Why is REF 2029 important for research careers?

    REF outcomes shape roughly £2 billion a year in England’s quality-related research funding allocation, so how outputs, portability and staff volume are counted directly affects hiring, promotion and contract-renewal decisions — making REF 2029’s rules a material factor in academic job security, not just an institutional accounting exercise.

    What should contract staff and institutions do now?

    For fixed-term and early-career staff, the practical response is to treat portability status as a standard question in job negotiations, alongside salary and workload — not an afterthought discovered after a move. Institutions preparing REF 2029 codes of practice should document, in writing, how “research expectation” is defined for non-standard contracts, given that ambiguity here is precisely what commentators have flagged as the mechanism through which precarity goes uncounted.

    The debate is unlikely to close cleanly. REF 2029’s guidance remains subject to further sector consultation ahead of the autumn 2028 submission, and the five-year monograph window will itself need testing against real career moves before its effect on mobility is clear. What is already established is that portability is no longer a settled default in UK research assessment — it is now a negotiated, output-type-specific rule that early-career and fixed-term staff need to understand before, not after, they change jobs.

  • Benefits of an ORCID iD Beyond Compliance

    The benefits of an ORCID iD go well beyond satisfying a funder’s checkbox. A free, persistent 16-digit identifier separates a researcher’s work from every other person who shares their name, follows them across every job change without re-registration, and lets publishers, funders and repositories pull existing data instead of asking for it again. Adopting one before a mandate forces the issue is a reputational and administrative decision that pays off on its own terms.

    An ORCID iD (Open Researcher and Contributor ID) is an identifier issued by a non-profit, community-governed registry, and anyone can register for one free in under two minutes. It exists to solve a problem that predates any funder policy: author name ambiguity across a fragmented, multi-employer research career.

    What Is an ORCID iD and What Is It Used For?

    An ORCID iD is a unique, persistent digital identifier assigned to an individual researcher, not to an institution, a job title, or a specific publication. It is used to attach a person’s name, affiliations, works, peer-review activity and grant history to one stable record that follows them for life.

    ORCID launched its registry on 16 October 2012 as an independent, non-profit organisation built specifically to fix author misattribution. The registry reached one million registrations by November 2014 and ten million by November 2020, according to ORCID’s own milestone announcements — a growth curve that tracks the steady expansion of mandates from publishers and funders, not the reverse.

    What it is used for in practice: linking manuscript submissions to a verified author record, auto-populating grant applications, crediting peer review and editorial work that never appears on a traditional CV, and giving repositories and CRIS systems a single key to match a person across systems.
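    Part of what lets other systems trust the identifier is that its final character is a checksum, calculated with the ISO 7064 MOD 11-2 algorithm that ORCID documents publicly. The sketch below validates the format and the check character. The function names are illustrative, and a passing check shows only that an iD is well formed, not that it is registered.

```python
def orcid_check_digit(base_digits):
    """Check character for the first 15 digits, using ISO 7064 MOD 11-2."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """True if the string is 16 characters (hyphens ignored) with a correct checksum."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16:
        return False
    base, check = compact[:15], compact[15]
    if not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_digit(base) == check


# 0000-0002-1825-0097 is the iD ORCID uses for its fictitious example researcher.
print(is_valid_orcid("0000-0002-1825-0097"))  # well formed
print(is_valid_orcid("0000-0002-1825-0098"))  # last character altered
```

    A check of this kind catches most typing errors before a mistyped iD attaches work to the wrong record.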

    Why Is Name Disambiguation the Strongest Case for Registering?

    Name collision is the single biggest threat to accurate research attribution, and it has nothing to do with whether a funder mandates an identifier. Common surnames, mid-career name changes (marriage, divorce, gender transition, religious conversion, transliteration) and inconsistent use of initials all cause work to be merged with, or split from, the wrong author.

    The scale of the problem is easy to underestimate. In library-science literature on author disambiguation, China’s three most common surnames — Wang, Li and Zhang — are routinely cited as covering more than a fifth of the country’s population, illustrating how unreliable a name alone is as an identifier once a research community spans billions of potential name-holders. An ORCID iD sidesteps the problem entirely: the identifier, not the string of characters in a byline, is what systems match on.

    • Distinguishes researchers who share an identical name, including within the same institution or field.
    • Survives a legal name change without breaking the link to prior publications.
    • Resolves transliteration inconsistencies across alphabets and naming conventions.
    • Lets a researcher claim credit for peer review, editorial board service and datasets that a CV alone cannot verify.
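    The difference between matching on a name and matching on an identifier shows up even in a three-record example. Everything in the sketch below is invented for illustration, including the `ID-1` and `ID-2` placeholders, which stand in for real ORCID iDs.

```python
from collections import defaultdict

# Two different people share a byline, and one of them also publishes under initials.
records = [
    {"title": "Paper A", "byline": "Wei Zhang", "orcid": "ID-1"},
    {"title": "Paper B", "byline": "Wei Zhang", "orcid": "ID-2"},
    {"title": "Paper C", "byline": "W. Zhang", "orcid": "ID-1"},
]


def group_titles(records, key):
    """Group paper titles by the chosen field ('byline' or 'orcid')."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["title"])
    return dict(groups)


by_name = group_titles(records, "byline")
by_id = group_titles(records, "orcid")

print(by_name)  # merges two people under one name and splits one person in two
print(by_id)    # one group per person
```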

    How Does an ORCID iD Move With You Between Employers?

    An ORCID iD is registered to the individual, never to an institution, so it survives every job change, fellowship, sabbatical and cross-border move a research career involves. This is the interoperability argument that funder-compliance framing misses entirely: the identifier is designed to outlast any single employment contract.

    The comparison researchers most often ask about is ORCID versus a professional networking profile such as LinkedIn. The two solve different problems, and conflating them undersells what ORCID does:

    Feature | ORCID iD | LinkedIn profile
    Governance | Non-profit registry; researcher owns and controls the record | Commercial platform; data used for advertising and platform value
    Persistent identifier | Yes: a permanent 16-digit ID | No: profile URL and account can change or be deleted
    System integration | Connects to publisher, funder, repository and CRIS systems via API | Not integrated with scholarly publishing or grant infrastructure
    Primary purpose | Verified research attribution and provenance | Professional networking and visibility

    Because the identifier — not the employer’s system — is the constant, a researcher who moves from a UK university to an EU institute, a US laboratory or an independent research organisation carries a single verifiable record of their contributions rather than starting a fresh profile each time.

    How Much Repeat Data Entry Does an ORCID iD Remove?

    Every grant application, manuscript submission, promotion case and institutional repository deposit historically asked a researcher to retype the same biography, employment history and publication list. An ORCID iD turns that one-time entry into a reusable record that other systems query rather than re-collect.

    Two concrete integrations illustrate the mechanism. Crossref’s auto-update service can push newly registered DOIs directly into a researcher’s ORCID record when a publisher deposits metadata that includes the author’s iD, with no manual claiming required once the researcher has granted permission. In the United States, the NIH’s SciENcv tool draws on ORCID data to help assemble the biosketch required in grant applications, cutting a task that once meant reformatting a CV into every agency’s preferred template.

    UKRI illustrates why waiting for a mandate is the wrong strategy. UKRI has confirmed that linking an ORCID iD will become mandatory for project leads, co-leads and fellows on its Funding Service — but only once that functionality is available, expected in 2027, with a further six-month grace period before enforcement. Researchers who register now spend the next year building a complete, cross-referenced record; researchers who wait start that process from zero under a compliance deadline.

    Common Questions About ORCID iD Benefits

    Should I put my ORCID iD on my CV?

    Yes. Adding your ORCID iD to a CV, email signature, repository profile and manuscript submissions gives every reader a single, verifiable link to your full research record. It removes ambiguity for hiring committees, journal editors and collaborators checking your publication history.

    Does an ORCID iD replace a CV?

    No, but it reduces reliance on a static document. An ORCID record can hold employment history, education, works and peer-review activity that stays current automatically, while a CV remains a curated, formatted document tailored to a specific application.

    Is ORCID like LinkedIn?

    No. ORCID is a non-profit registry built for scholarly attribution and system interoperability, while LinkedIn is a commercial networking platform. They serve adjacent but distinct purposes and are not interchangeable for research provenance.

    Is it necessary to have an ORCID iD?

    It is not universally mandatory today, though an increasing number of funders and publishers require or strongly encourage one. The reputational and portability case for registering exists independently of any current or future mandate.

    The Bottom Line for Researchers Without a Mandate

    Treating an ORCID iD as a compliance item to defer until a funder forces the issue misreads what the identifier actually does. Its value is disambiguation that protects a researcher’s reputation, portability that survives every employer change, and a reusable data record that ends repetitive re-entry across grant, publication and repository systems.

    CASRAI originated the CRediT contributor role taxonomy in 2014 to make individual research contributions explicit and attributable; the standard is now stewarded by NISO as ANSI/NISO Z39.104-2022. CRediT roles and ORCID iDs are increasingly paired in publisher submission systems for the same reason: attribution only works when the identifier behind it is persistent, verifiable and independent of any one institution. Registering an ORCID iD now, ahead of pending mandates such as UKRI’s, is the lower-effort path to the same outcome.

  • Credit Taxonomy Authorship: A Case for Funder Adoption in Grant Reporting

    Opinion: grant reporting should require structured credit taxonomy authorship data alongside biosketches and final reports. Funders currently reward the named author list, not the research team that actually produced the work — and the CRediT roles already used by publishers are the readiest tool to fix that gap. This is a CASRAI perspective, not a report of confirmed funder policy: no major funder currently mandates it.

    The Contributor Role Taxonomy (CRediT) is a standardised set of 14 roles — from conceptualisation and data curation to funding acquisition and writing — used to describe who did what on a research output, distinct from the narrower question of who qualifies as an “author”. CASRAI originated CRediT in 2014; the standard is now stewarded by NISO as ANSI/NISO Z39.104-2022, and it is licensed CC-BY 4.0 for free reuse by anyone, including funders.

    What is the CRediT taxonomy, and why does grant reporting ignore it?

    CRediT is not an authorship test. It does not decide who qualifies as an author under criteria such as those set out by the International Committee of Medical Journal Editors (ICMJE); it describes contribution type once a research output exists. Publishers including Elsevier, Wiley and Taylor & Francis now require a CRediT statement at submission, mapping each named author to one or more of the 14 roles.
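    A CRediT statement is, in effect, a mapping from each contributor to a subset of the 14 role names. The sketch below stores the role list and flags any label that is not in the taxonomy. The function name and the example contributors are hypothetical, and the role spellings follow the taxonomy's published English labels.

```python
# The 14 CRediT roles, using the taxonomy's published English labels.
CREDIT_ROLES = frozenset({
    "Conceptualization",
    "Data curation",
    "Formal analysis",
    "Funding acquisition",
    "Investigation",
    "Methodology",
    "Project administration",
    "Resources",
    "Software",
    "Supervision",
    "Validation",
    "Visualization",
    "Writing – original draft",
    "Writing – review & editing",
})


def unknown_roles(statement):
    """Return {contributor: [labels not in CRediT]} for a contributor statement."""
    problems = {}
    for person, roles in statement.items():
        unrecognised = sorted(set(roles) - CREDIT_ROLES)
        if unrecognised:
            problems[person] = unrecognised
    return problems


# Invented contributors, purely for illustration.
statement = {
    "A. Example": ["Conceptualization", "Writing – original draft"],
    "B. Example": ["Software", "Data curation", "Lab management"],
}
print(unknown_roles(statement))  # flags the one label outside the taxonomy
```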

    Grant reporting sits entirely outside this system. A funder’s final report typically lists a principal investigator, co-investigators, and a project narrative — not a structured breakdown of who curated the data, who wrote the software, or who administered the project day to day. That gap matters because grant reports, not journal articles, are where funders form their view of “who delivered this award”.

    The case for funder-required credit taxonomy authorship data

    Three arguments support requiring CRediT-style data in grant reporting, not just at publication.

    • Credit for non-PI staff. Research software engineers, data managers, and postdoctoral researchers frequently deliver the technical core of a funded project without ever becoming a named co-investigator on the award. A contributor-role field in the final report creates an auditable record of that work, independent of authorship politics on any resulting paper.
    • Better evidence for funders’ own decisions. Funders assess renewal applications, track record, and “who can actually deliver” partly from CVs and biosketches. A structured role history — built cumulatively across a researcher’s funded outputs — is a more reliable signal than author position, which varies wildly by discipline and negotiation.
    • Continuity with ORCID. ORCID has supported CRediT role tagging on individual “Works” records since 2019. Extending the same structured field to the grant-reporting stage would let a researcher’s contributor history accumulate consistently across both outputs and awards, rather than resetting at each reporting boundary.

    None of this requires funders to redefine authorship. It only requires them to capture, at the reporting stage, data that publishers already collect at the publication stage.

    The administrative-burden counter-argument

    The strongest objection is not conceptual, it is operational. Grant reporting is already a compliance burden for research offices, and adding another structured field is not free.

    • Duplication risk. If contributor roles are recorded once at reporting and again at publication, teams will re-key the same information twice unless the two systems are linked via ORCID or a shared identifier.
    • Multi-institutional friction. Large consortium awards, common in Horizon Europe and UKRI-funded collaborations, involve dozens of contributors across institutions with different research-information systems; agreeing roles before a report deadline adds negotiation overhead.
    • Taxonomy fit. The 14 CRediT roles were designed for journal-article contributions. Some categories of grant-funded work — public engagement, infrastructure maintenance, cohort recruitment — map awkwardly onto the existing role list without local adaptation.

    These are real costs, not reasons to abandon the idea. They are reasons to pilot it narrowly and design the reporting field so it can be pre-populated from existing ORCID or publication CRediT data rather than entered from scratch.
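    One way pre-population might work is to fold the contributor statements already attached to an award's publications into a single per-person summary for the report. The sketch below assumes each statement is available as a mapping from ORCID iD to role names. That input format, the function name and the `ID-1`-style placeholders are all hypothetical, not a description of any funder's or publisher's system.

```python
from collections import defaultdict


def prepopulate_report_block(publication_statements):
    """Merge per-publication {orcid: [roles]} statements into one
    {orcid: sorted unique roles} block for a grant report draft."""
    block = defaultdict(set)
    for statement in publication_statements:
        for orcid, roles in statement.items():
            block[orcid].update(roles)
    return {orcid: sorted(roles) for orcid, roles in sorted(block.items())}


# Placeholder identifiers and invented role assignments.
paper_one = {"ID-1": ["Conceptualization", "Supervision"], "ID-2": ["Software"]}
paper_two = {"ID-2": ["Software", "Data curation"], "ID-3": ["Investigation"]}

draft = prepopulate_report_block([paper_one, paper_two])
for orcid, roles in draft.items():
    print(orcid, roles)
```

    A draft built this way would still need review by the project team, since roles on the award, such as project administration, may never have appeared on a paper.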

    How grant reporting compares with today’s publisher practice

    The asymmetry between publication-stage and award-stage contributorship data is the core of the argument. It also happens to be an information gap most coverage of CRediT does not spell out.

    Stage / stakeholder | Structured contributor-role data required today? | Mechanism, where it exists
    Major journal publishers (Elsevier, Wiley, Taylor & Francis) | Yes, at submission | CRediT author statement mapping each author to one or more of 14 roles
    Grant final/interim reports (typical funder templates) | No | Narrative project summary and named investigator list only
    NIH biosketch | No structured field | Free-text “Contributions to Science” section
    ORCID “Works” record | Optional, researcher-populated | CRediT role tags supported since 2019
    This proposal (CASRAI perspective) | Argued position, not existing policy | A CRediT-derived contributor-role block appended to funder reports, pre-populated from ORCID where possible

    Answer-first questions on CRediT and author contributions

    What is funding acquisition in author contribution?

    Funding acquisition is one of CRediT’s 14 defined roles, covering acquisition of the financial support for the project that led to the published output. It is the single CRediT role most directly relevant to grant reporting, since it explicitly separates the person who secured the award from those who executed the research, a distinction current biosketch narratives rarely make cleanly.

    What are the criteria for author contribution?

    Under ICMJE criteria, authorship requires substantial contribution to the work’s conception or design (or data acquisition, analysis, or interpretation), drafting or critically revising the manuscript, final approval of the published version, and agreement to be accountable for it. CRediT does not replace these criteria; it sits alongside them to describe contribution type once authorship has already been determined.

    What are examples of author contributions?

    Typical CRediT-defined contributions include conceptualisation, data curation, formal analysis, funding acquisition, investigation, methodology, project administration, resources, software, supervision, validation, visualisation, and the two writing roles (original draft, and review and editing). A single individual can hold several roles on one output.

    Implications for funders and institutions

    If funders moved toward requesting credit taxonomy authorship data in grant reports, research offices would need three things before a mandate could work in practice: an ORCID-linked pre-population mechanism to avoid double entry, a pilot cohort limited to a small number of funding calls, and explicit guidance that CRediT roles describe contribution, not authorship eligibility, so institutions do not over-interpret the data during promotion or tenure review.

    The honest case for funder adoption is incremental, not sweeping: pilot it on a subset of awards, link it to ORCID so it is populated once and reused, and treat early results as evidence rather than assuming the benefit before it is tested. Given that publishers already run this system at scale, the marginal cost of extending it to one more stage, grant reporting, is smaller than building a comparable structure from nothing.

  • Indigenous Data Sovereignty: Why FAIR Needs CARE

    Indigenous data sovereignty is the right of Indigenous peoples and nations to govern the collection, ownership, interpretation, and application of data about their own communities, lands, and knowledge. Blanket “open by default” research-data mandates built on the FAIR Data Principles can override that right when they treat findability and accessibility as unconditional. The fix is not to abandon FAIR, but to add a CARE-informed consent layer — tiered access controls, negotiated data-sharing agreements, and governance authority held by the originating community — that sits inside FAIR’s own accessibility principle rather than outside it.

    As funders push open-data compliance deeper into grant conditions, research offices increasingly have to reconcile a mandate to publish with a community’s right to say no, say later, or say “only under these conditions.”

    What is Indigenous data sovereignty?

    Indigenous data sovereignty describes the inherent right of Indigenous peoples to govern data about their own communities, resources, and lands — a right that derives from tribal and national self-determination rather than from any single data-protection statute. The Global Indigenous Data Alliance (GIDA) traces the movement’s institutional roots to country-specific networks: the Aotearoa New Zealand-based Te Mana Raraunga (Māori Data Sovereignty Network, formed 2015), Australia’s Maiam nayri Wingara Aboriginal and Torres Strait Islander Data Sovereignty Collective (2017), Canada’s First Nations Information Governance Centre, and the US Indigenous Data Sovereignty Network.

    These networks converged on a shared position: data collected about Indigenous peoples should remain subject to the governance of the nation or community it describes — including tribal law — not solely the policies of the funder, institution, or repository that hosts it. This is a governance claim, not merely a privacy preference, and it applies whether the data in question is health records, environmental monitoring, ceremonial knowledge, or genomic samples.

    How do CARE principles relate to FAIR data principles?

    The CARE Principles for Indigenous Data Governance — Collective Benefit, Authority to Control, Responsibility, and Ethics — were developed specifically to sit alongside the FAIR Data Principles (Findable, Accessible, Interoperable, Reusable), not to replace them. The Research Data Alliance’s International Indigenous Data Sovereignty Interest Group formalised CARE in 2019 to address what FAIR, on its own, does not: who benefits, who decides, and under what ethical obligations data circulates.

    Principle set | Primary question it answers | Governing focus
    FAIR (Findable, Accessible, Interoperable, Reusable) | How usable is the data, technically? | Data as an object
    CARE (Collective Benefit, Authority to Control, Responsibility, Ethics) | Who benefits, and who decides? | Data as a relationship

    Framing these as rivals misreads FAIR’s own text. FAIR principle A1.2 explicitly states that the accessibility protocol must “allow for an authentication and authorisation procedure, where necessary” — meaning FAIR was never a synonym for unconditional open access. Data can be fully findable, with rich metadata, a persistent identifier, and a documented access route, while the underlying content sits behind a governed permission gate. That gap between “discoverable” and “downloadable” is precisely where a CARE-informed consent layer belongs.

    Do open data mandates override Indigenous data sovereignty?

    Open data mandates do not automatically override Indigenous data sovereignty, but poorly designed ones can function that way in practice. Funder policies such as UKRI’s research data policy and cOAlition S’s Plan S commitments require data to be made available on an “as open as possible, as restricted as necessary” basis, a formulation that already anticipates legitimate restriction yet is frequently implemented by institutions as a default push toward maximal openness.

    PLOS’s own editorial position, published in its EveryONE blog in October 2023, states plainly that Indigenous Data Sovereignty is the right of Indigenous peoples to own and govern data about their communities, resources, and lands — and that open-access publishing policies must accommodate, not override, that right through mechanisms such as data-access statements that explain restrictions rather than force disclosure. The Australian Institute of Aboriginal and Torres Strait Islander Studies (AIATSIS) Code of Ethics for Aboriginal and Torres Strait Islander Research similarly requires researcher agreements on data ownership, access, and storage to be negotiated with communities before collection begins, not retrofitted at publication.

    • Where mandates and sovereignty align: both frameworks require documented data-management plans, clear provenance, and persistent identifiers.
    • Where friction emerges: “open by default” clauses that treat non-disclosure as an exception requiring justification, rather than a governance decision requiring respect.
    • The resolvable middle: metadata and access statements can be fully open even when the underlying dataset is access-controlled.

    A consent layer is a set of governance and technical controls — inserted between data creation and data reuse — that lets a community set the terms under which its data is discovered, accessed, and re-used, without removing that data from the research record entirely. In practice this combines four elements research administrators already have tools for:

    1. Tiered metadata: a public, FAIR-compliant record (title, abstract, provenance, persistent identifier via DataCite or Crossref) that is fully findable even when the dataset itself is restricted.
    2. Governance-holder sign-off: a named Indigenous governance body (tribal council, iwi authority, data sovereignty collective) with authority to approve, condition, or decline each reuse request — not a one-time blanket consent captured at initial collection.
    3. A trusted research environment (TRE): a controlled-access computing environment where approved researchers can analyse restricted data without exporting raw records, satisfying reusability without unconditional distribution.
    4. Biocultural or Traditional Knowledge labels: machine-readable metadata tags (the Local Contexts initiative’s TK and BC Labels) that travel with a dataset to signal provenance, cultural protocols, and permitted uses wherever it is indexed or mirrored.

    None of these four elements blocks findability. They condition access, which is exactly what FAIR’s accessibility principle already permits.
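
    The four elements above can be sketched as a small data model. This is an illustrative sketch only: the class names, field names, and the `access_route` method are hypothetical, the identifier and label values are placeholders, and a real deployment would rely on a repository's and a TRE's own access controls rather than an in-memory object.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Decision(Enum):
    APPROVED = "approved"
    CONDITIONAL = "approved with conditions"
    DECLINED = "declined"


@dataclass
class PublicRecord:
    """Element 1: the always-public, findable metadata tier."""
    title: str
    persistent_id: str      # placeholder value in the example below
    access_statement: str   # explains the restriction and how to request access
    labels: List[str] = field(default_factory=list)  # element 4: TK/BC label names


@dataclass
class GovernedDataset:
    public: PublicRecord
    governance_body: str    # element 2: the named body that decides each request
    decisions: Dict[str, Decision] = field(default_factory=dict)

    def discover(self) -> PublicRecord:
        """Findability is unconditional: metadata is returned to anyone."""
        return self.public

    def record_decision(self, request_id: str, decision: Decision) -> None:
        """Store the governance body's decision for one reuse request."""
        self.decisions[request_id] = decision

    def access_route(self, request_id: str) -> str:
        """Element 3: approved requests go to the TRE, never to raw export."""
        decision = self.decisions.get(request_id)
        if decision in (Decision.APPROVED, Decision.CONDITIONAL):
            return "trusted-research-environment"
        return "no-access"  # declined, or no decision recorded yet


dataset = GovernedDataset(
    public=PublicRecord(
        title="Example river-monitoring dataset",
        persistent_id="doi:10.0000/placeholder",
        access_statement="Access by request to the named governance body.",
        labels=["example-tk-label"],
    ),
    governance_body="Example governance council",
)

print(dataset.discover().title)            # metadata is visible before any request
print(dataset.access_route("request-1"))   # nothing decided yet
dataset.record_decision("request-1", Decision.CONDITIONAL)
print(dataset.access_route("request-1"))   # routed to the TRE after sign-off
```

    The default is the design choice that matters here: with no recorded decision the route is `no-access`, so consent is given per request rather than as a blanket grant, while `discover()` never depends on any decision at all.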

    Data sharing agreement vs data processing agreement — which applies?

    A data sharing agreement (DSA) and a data processing agreement (DPA) serve different legal functions, and conflating them is a common source of failure in Indigenous data governance. A DSA governs the transfer of data between two parties who each have independent authority over how it is subsequently used — the correct instrument for Indigenous data sovereignty, because it lets the originating community retain and exercise ongoing authority to control, per CARE’s second principle.

    A DPA, by contrast, is used when one party (a processor) handles data strictly on behalf of another (the controller) with no independent decision-making rights — the model built into contract templates under UK GDPR. Using a DPA where a DSA is required strips the originating community of ongoing authority.

    Instrument | Who holds decision authority | Fit for Indigenous data sovereignty
    Data Sharing Agreement (DSA) | Both parties, independently | Appropriate: preserves community authority to control
    Data Processing Agreement (DPA) | Controller only; processor has none | Inappropriate as a standalone instrument: reduces community to data subject

    Implications for research administrators

    Research data management (RDM) policy templates written purely around funder compliance checklists will systematically under-serve Indigenous data governance unless they build in a consent layer as a standard clause, not an exception process. Institutions should require, at the data-management-plan stage, an explicit question: does this dataset describe an Indigenous community, and if so, has a governance body with authority to control been identified and consulted before collection?

    Research data repositories that host Indigenous-derived datasets should support tiered access controls and TK/BC Label metadata natively, rather than treating restricted access as a bespoke workaround bolted onto an open-by-default platform. Institutions building or procuring a trusted research environment for sensitive data should evaluate whether it can enforce community-set reuse conditions per dataset, not merely per project.

    Conclusion: consent is compatible with findability

    Indigenous data sovereignty and the FAIR Data Principles are not opposed frameworks competing for the same ground — FAIR governs how data is described and discovered, while CARE and a CARE-informed consent layer govern who decides what happens next. A research data management policy that hard-codes this distinction, uses the right agreement type for the right relationship, and gives Indigenous governance bodies a standing role rather than a one-off consultation, satisfies funder open-data requirements and Indigenous data sovereignty at the same time. The two are compatible by design; the mandates just need to stop assuming otherwise.

  • Has cOAlition S Retreated From Plan S Rules?

    cOAlition S has not abandoned the goal of full and immediate open access, but its 2026-2030 strategy drops the enforcement mechanism that made Plan S distinctive: financial support for transformative agreements ended after 2024, replaced by a looser, consultation-led push toward diamond open access and preprints. Science.org’s reporting calls this a retreat from strict requirements; cOAlition S calls it a “recalibration” of the same founding mission. Both are partly right, and research administrators deciding how much weight to put on the new targets need to understand exactly what changed.

    Plan S is the funder mandate, launched in September 2018 by cOAlition S, requiring that publications from publicly funded research be made immediately available under an open licence, without embargo, from 2021 onward. cOAlition S is the consortium of national and philanthropic research funders — including UKRI and the Wellcome Trust — that created and enforces that mandate.

    What Does the 2026-2030 Strategy Actually Change?

    The cOAlition S Strategy 2026-2030, adopted by the coalition’s Leaders Group in November 2025, keeps the founding commitment to full and immediate open access but widens the toolkit for getting there. Where the original Plan S centred on a single lever — funder mandates tied to compliance checks — the new strategy explicitly states that “no single model can meet all needs” and extends its focus “beyond mandates and funding conditions.”

    Three priorities anchor the plan: strengthening the foundations for sustainable and equitable open access (including an update to the Plan S principles to foreground Publish-Review-Curate models, diamond open access and preprints); supporting open digital infrastructures, including work on artificial intelligence’s implications for scholarly publishing; and exploring financially sustainable, non-APC publishing systems. Implementation runs in two phases — foundational work in 2026-2027, followed by a deeper equity and sustainability push in 2028-2030, subject to Leaders Group review.

    Why Does Science.org Call This a Retreat?

    Science.org’s analysis, headlined “After Coalition S disrupted scientific publishing, new plan retreats from strict requirements,” argues the new strategy has no teeth. Its central claim: cOAlition S is trading enforceable compliance rules for a broader, softer vision that favours alternatives to paywalled journals without committing to actually replace them.

    The magazine credits the original Plan S with helping push the global share of newly published papers appearing as open access above 50% within a few years of the 2021 mandate taking effect. But it also revisits a well-documented side effect: Plan S’s compliance route pushed many publishers toward author-pays gold and hybrid open access, and some prestigious journals now charge authors thousands of dollars per article while continuing to publish paywalled content elsewhere in the same title. A commentary from Science’s news desk on social media put the critique concisely: the latest strategy “emphasizes consultation, but lacks spending pledges.”

    • No new mandate deadlines are attached to the 2026-2030 priorities.
    • No enforcement or compliance-checking mechanism replaces the one built around transformative agreements.
    • Financial commitments are framed as exploratory (“investigate,” “monitor”) rather than binding.

    How Does cOAlition S Defend the New Strategy?

    cOAlition S rejects the framing of “retreat” outright. Its own communications describe the strategy as reinforcing, not loosening, its open access commitment, under a refreshed vision of “a scholarly communication system that enables rapid, open, transparent, and equitable sharing of trustworthy scientific knowledge.”

    The coalition points to concrete institution-building as evidence of continuity rather than disengagement: it appointed Curt Rice — former rector of Oslo Metropolitan University and the Norwegian University of Life Sciences, and former Executive Director of Fulbright Norway — as its first standing Director, announced 13 May 2026, specifically to lead delivery of the 2026-2030 strategy. It has also named OPERAS, the European research infrastructure for open scholarly communication, as its new Host Secretariat, and it co-produced the Bengaluru Roadmap and Action Plan on Diamond Open Access at the 3rd Global Summit on Diamond Open Access. None of that reads as an organisation stepping back — it reads as one restructuring around a different theory of change: build sustainable, non-commercial infrastructure rather than police compliance.

    What Happens to Transformative Agreements?

    Transformative agreements — the “read and publish” deals between institutions and publishers designed to convert subscription spend into open access output — are the clearest casualty of the shift. cOAlition S confirmed the end of its financial support for open access publishing under transformative arrangements after 2024, having already stopped accepting new applications to the programme after 30 June 2023.

    In their place, the 2026-2030 strategy channels investment toward diamond open access — journals and platforms that charge neither authors nor readers — and toward preprint infrastructure. Diamond open access is a publishing model funded through institutional, library-consortium or public grants rather than per-article charges, positioned by cOAlition S as the more equitable long-term alternative to both subscription paywalls and high-cost APCs.

    Mechanism | Status under Plan S (2018-2024) | Status under 2026-2030 strategy
    Transformative agreements | Funded as a transitional route to compliance | Funding ended after 2024; no new applications since June 2023
    Diamond open access | Encouraged, not prioritised | Named strategic priority, backed by the Bengaluru Roadmap
    Compliance mandate | Immediate OA required from 2021, checked via the Journal Checker Tool | Principles retained, but no new binding deadlines set
    Governance | Coordinated informally among funders | Standing Director (Curt Rice) and OPERAS-hosted Secretariat

    Answer-First Questions on Plan S and cOAlition S

    What Is Plan S?

    Plan S is a funder-led initiative, launched in September 2018, requiring that publications resulting from publicly funded research be published in open-access journals, on open-access platforms, or deposited in open repositories immediately, without embargo. It is supported by cOAlition S, an international consortium of national and philanthropic research funders.

    What Is the Main Principle of Plan S?

    The core principle is that, from 2021, all scholarly publications funded by public or private grants from participating funders must be made immediately available in open access, without embargo, under an open licence — typically CC BY. That mandate remains unchanged in the 2026-2030 strategy; what has changed is how compliance is supported.

    Is Open Access Always Free for Everyone?

    No. Open access guarantees free reading access, not free publishing. Under the author-pays model that expanded alongside Plan S compliance, many journals shifted costs onto authors through article processing charges, which critics — including Science.org — argue created a new equity problem the 2026-2030 strategy now explicitly tries to address through diamond open access.

    What Does This Mean for Institutions and Publishers?

    For research administrators and institutional leaders, the practical takeaway is that Plan S’s headline compliance requirement has not disappeared — the Journal Checker Tool still governs how researchers assess eligible venues — but the financial pressure that pushed publishers into transformative agreements has been withdrawn. Institutions currently relying on transformative deals negotiated with cOAlition S funding in mind should not assume renewal on the same terms.

    Publishers, meanwhile, face a genuine strategic fork: continue investing in APC-based hybrid and gold open access, where cOAlition S funding is no longer available, or build toward diamond and Publish-Review-Curate models that better match the coalition’s stated 2028-2030 priorities. Institutions tracking funder mandates and compliance timelines through their research administration functions will find this shift material to budget planning, not just messaging.

    Neither “retreat” nor “recalibration” fully settles the argument. Science.org is correct that the new strategy carries no new enforcement mechanism and no fresh spending pledge. cOAlition S is correct that its founding mandate — immediate, unembargoed open access — has not been withdrawn on paper. The honest reading sits between the two: cOAlition S has traded a narrower, harder lever for a broader, softer one, betting that infrastructure and diamond open access will do the work that compliance deadlines used to do. Whether that bet pays off will be visible well before 2030, in whether diamond open access funding actually scales and whether APC inflation slows without a mandate forcing the issue.

  • Pro-Innovation AI Regulation: Three Years On

    A pro-innovation approach to AI regulation delivered exactly what its title promised for UK research institutions: no new AI regulator, no statutory duty, and continued reliance on existing bodies. Three years on, universities gained substantial research funding and an AI sandbox model, but the dedicated AI Act many assumed would eventually follow has still not arrived — even as the EU quietly loosens its own.

    A pro-innovation approach to AI regulation is the UK government’s March 2023 white paper setting out a non-statutory, principles-based framework in which existing sector regulators — rather than a new central AI authority — apply five cross-sectoral principles to AI use within their own remits.

    What the white paper actually promised research institutions

    Published by the Department for Science, Innovation and Technology on 29 March 2023, the white paper explicitly rejected an EU-style AI Act. Instead, it committed to five non-statutory principles — safety, security and robustness; appropriate transparency and explainability; fairness; accountability and governance; and contestability and redress — for regulators to interpret within existing remits.

    For research institutions, three commitments mattered most: a central government function to monitor cross-cutting AI risk, AI “sandboxes” allowing controlled real-world testing, and an explicit acknowledgement that foundation models developed or fine-tuned inside universities would fall under the same principles as commercial deployments. The paper also floated a fallback: if voluntary compliance proved insufficient, government reserved the option to introduce a statutory duty requiring regulators to have regard to the principles.

    What has materialised, three years on

    Judged against its own text, the framework has been substantially delivered — but the research-funding side of the ledger moved faster and further than the regulatory side.

    2023 white paper commitment | Status by mid-2026 | Detail
    Five non-statutory principles applied by existing regulators | Delivered | ICO, CMA, FCA and Ofcom apply the principles via the Digital Regulation Cooperation Forum; no new central AI regulator was created
    Central risk-monitoring function | Delivered, narrowed | The AI Safety Institute, launched in November 2023, was renamed the AI Security Institute in February 2025, with bias and fairness work explicitly dropped from its remit
    AI sandboxes for controlled testing | Delivered via new vehicle | Cross-economy AI sandboxing powers now sit in the Regulating for Growth Bill, announced in the May 2026 King’s Speech, rather than in standalone AI legislation
    Statutory duty on regulators (fallback option) | Not introduced | No statutory duty to have regard to the principles has been legislated; the non-statutory model remains in force
    Research funding to build AI capability | Delivered and exceeded | UKRI committed £80 million to nine AI research hubs (February 2024) and £117 million to 12 AI Centres for Doctoral Training to train around 900 PhD students, plus up to £60 million for two further labs at Oxford and UCL
    Dedicated AI Act | Not delivered | As of July 2026, no AI Bill has been laid before Parliament by government

    The research-funding commitments arguably over-delivered relative to the white paper’s own modest framing, which discussed capability-building only in general terms. The regulatory commitments, by contrast, tracked the white paper almost exactly: light-touch, sector-led, and still without primary legislation.

    Does the UK have an AI Act yet?

    No. The House of Commons Library’s research briefing, last updated 10 June 2026, states plainly that “the UK does not have any AI-specific regulation or legislation covering AI as a technology” — AI is instead regulated only through the lens of whatever sector or use case it appears in.

    The clearest signal of intent came in the May 2026 King’s Speech, where government introduced the Regulating for Growth Bill rather than a standalone AI Act. The Bill creates cross-economy sandboxing powers — explicitly covering AI-enabled products and services — and strengthens regulators’ existing “growth duty.” This is the 2023 white paper’s sandbox-and-existing-regulators architecture, carried into the one piece of legislation government did choose to bring forward, rather than superseded by it.

    The gap with the EU narrowed in practice in 2026, though not in principle. Under the Digital Omnibus on AI, agreed by the Council and Parliament on 7 May 2026 and formally endorsed on 16 and 29 June 2026, the EU deferred applicability of high-risk obligations for standalone Annex III AI systems from 2 August 2026 to 2 December 2027, a sixteen-month delay, and for product-embedded Annex I systems to 2 August 2028. The bloc that legislated first is now easing its own timetable in the same direction the UK chose from the start: slower, more sector-specific, less prescriptive. For research institutions running UK-EU collaborative projects, the day-to-day compliance gap between the two regimes has therefore shrunk, but the structural difference remains: EU partners still face a statutory Act; UK partners still do not.

    Answer-first Q&A

    Is there any regulation on AI in the UK?

    Yes, but not AI-specific regulation. AI use is governed by existing sectoral law — UK GDPR and the Data Protection Act for data processing, equality law for discrimination, and regulator guidance from the ICO, CMA, FCA and Ofcom applying the white paper’s five principles within their own remits.

    What are the guidelines for AI in the UK?

    The core guidelines are the white paper’s five cross-sectoral principles: safety, transparency, fairness, accountability and contestability. Regulators translate these into sector guidance; the ICO’s AI guidance and UKRI’s generative-AI guidance for grant applications and peer review are two research-relevant examples.

    Does the UK have any laws on AI?

    The UK has no AI-specific statute. AI-related legal obligations instead arise from existing frameworks — data protection, product safety, equality and sector regulation — applied to AI as a use case, a position the Commons Library confirmed again in its June 2026 briefing.

    What is the AI legislation in 2026?

    The main 2026 development is the Regulating for Growth Bill, announced in the King’s Speech, which creates cross-economy AI sandboxing powers and strengthens regulators’ growth duty. It is not a dedicated AI Act and does not replace the 2023 white paper’s non-statutory model.

    What this means for research administrators

    For institutions managing research integrity, ethics review and international collaboration, the practical position has not changed since 2023: there is still no single AI compliance regime to point to. Research offices assessing AI use in grant applications, peer review or data processing must continue mapping obligations across data protection, funder policy and sector guidance individually, rather than against one statute.

    • UKRI’s generative-AI guidance for grant applications and peer review remains the most directly applicable research-specific rule set.
    • The AI Security Institute’s narrowed remit means bias and fairness concerns in research AI tools sit with the ICO and funders, not a national safety body.
    • Cross-border projects with EU partners should track the Digital Omnibus’s revised 2027–2028 timetable separately from any UK sandbox rollout under the Regulating for Growth Bill.
    • No statutory duty exists yet requiring UK regulators to apply the five principles consistently, so guidance can still vary by sector and by regulator.

    The verdict, three years on

    The 2023 white paper’s central bet — that voluntary, principles-based, regulator-led governance would prove durable rather than a stopgap before statute — has held. Government has repeatedly reaffirmed rather than abandoned that bet, most recently by routing AI sandboxing through the Regulating for Growth Bill instead of standalone legislation. Research institutions received the funding side of the promise in full and then some; they received the regulatory side almost exactly as written, for better or worse. Whether that remains defensible depends on what the EU’s now-softening Act ends up looking like once its delayed obligations finally bite in December 2027 — at which point the UK’s three-year wait for clarity may look either prescient or merely prolonged.

    Research administrators tracking these obligations alongside authorship, funder mandates and evolving research-integrity standards can find related context in CASRAI’s research administration resources.