CASRAI Dictionary

Sustainable laboratory operations: LEAF, My Green Lab, and the carbon footprint of research
The carbon footprint of research is unusually large per worker. A typical wet lab consumes 3-10 times the energy of an equivalent office space; a single ultra-low-temperature freezer running 24/7 uses as much electricity as an average household; single-use plastics in life-science labs alone are estimated globally at 5.5 million tonnes per year. A research-intensive university’s Scope 1 and 2 emissions sit primarily in lab buildings; its Scope 3 sits primarily in travel and procurement. The sustainability conversation in 2026 has moved from awareness to operational programmes, with two frameworks dominating: LEAF (the Laboratory Efficiency Assessment Framework developed at UCL) and My Green Lab certification. This post is a practical tour.

Why labs are different

Office sustainability programmes (LED lighting, paper recycling, energy-efficient computing) translate poorly to laboratories because the energy intensity is in equipment that cannot simply be switched off. A freezer holding biological samples cannot be turned off at night; a fume hood cannot be reduced to standby; a sequencer cannot run at half power. Lab sustainability is therefore primarily a question of which equipment runs and how it is operated, not whether it is on.

The corollary is that lab sustainability programmes need a different vocabulary and a different evidence base than office sustainability. Both LEAF and My Green Lab were designed in response to this; both have been validated empirically over the 2018-2024 period.

LEAF

The Laboratory Efficiency Assessment Framework, developed at University College London by Martin Farley and colleagues, is a self-assessment-and-certification framework structured at three levels (Bronze, Silver, Gold). LEAF’s strength is that it is operationally specific: each level lists discrete actions a lab can take, and the actions are tied to estimated energy and waste impacts based on UK lab benchmarks.

The LEAF Bronze level covers basics: freezer temperature optimisation, sash management on fume hoods, equipment shutdown protocols, recycling, lighting, water conservation. Silver adds: research-life-cycle assessment, supplier engagement, training, advocacy. Gold adds: integrated sustainability planning, leadership in the institution’s sustainability programme, mentorship of other labs.

By 2026 LEAF is in use at over 1,000 institutions globally, with strong concentration in the UK (where it was developed and has the strongest institutional backing). The LEAF self-assessment is free; the certification process involves institutional review. The framework is open and has been adapted to several national contexts.

The LEAF impact data

The 2023 UCL study of LEAF Bronze-certified labs found average energy reductions of 5-15% versus baseline, primarily from freezer optimisation and fume-hood sash management. The 2024 follow-up at LEAF Silver labs found additional 5-10% reductions and significant reductions in single-use plastic consumption through supplier engagement. The data underwrite the framework: this is not aspirational, it is documented.

My Green Lab

My Green Lab is a US-based non-profit operating a complementary certification programme that has dominated the North American market and increasingly the international one. My Green Lab Certification (MGLC) is built around a survey assessment in 14 categories (energy, water, waste, green chemistry, purchasing, training, etc.) with a numerical score and an annual recertification cycle.

My Green Lab also operates the ACT Label (Accountability, Consistency, Transparency) for lab products: a vendor-supplied environmental-impact rating for an individual product (pipette tip, plate, reagent) covering its energy, water, packaging, and chemical inputs. The ACT Label is in widespread use across major lab suppliers (Eppendorf, Thermo Fisher, Bio-Rad, NEB) and has become a discriminator at the procurement stage for sustainability-minded labs.

By 2026 My Green Lab claims over 4,000 certified labs across more than 70 countries; the certified-lab cohort is concentrated in pharmaceutical and biotechnology industrial research, with growing academic uptake.

Choosing between LEAF and My Green Lab

Most institutions can choose one and stick with it. LEAF is more prescriptive (the levels list specific actions); My Green Lab is more diagnostic (the survey identifies areas for improvement and tracks scores over time). LEAF has stronger UK and European institutional backing; My Green Lab has stronger US and pharmaceutical-industry backing. An institution coordinating with a major US pharmaceutical partner is likely better off with My Green Lab; one coordinating with UKRI or EU funders is likely better off with LEAF.

Some institutions run both. The duplicate-overhead cost is moderate (the underlying lab practices are largely the same; the documentation is different) and the dual recognition can be useful.

Freezer management: the biggest single lever

The single most impactful intervention in most labs is freezer management. A typical -80°C ultra-low-temperature freezer consumes 16-22 kWh/day. Switching to a -70°C set point (acceptable for most stored samples per the 2018 Gail Phébré et al. validation) cuts consumption by 30-40%. Combining this with a sample-inventory audit frees space and avoids new freezer purchases: in most labs, 10-30% of freezer contents are unused, unlabelled, or duplicated and can be discarded.

The UCL Race to Zero trial in 2022-2023 had 53 labs switch -80°C freezers to -70°C; the energy savings were as predicted and no sample integrity issues were reported across 12+ months of follow-up. This is now standard guidance in both LEAF and My Green Lab.
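
To make the arithmetic concrete, the sketch below turns the figures quoted above (16-22 kWh/day and a 30-40% reduction) into an annual saving per freezer. The function name and the 53-freezer scaling (one freezer per trial lab) are illustrative assumptions, not numbers reported by the trial.

```python
# Figures from the text: 16-22 kWh/day at -80°C, 30-40% saving at -70°C.
DAILY_KWH_RANGE = (16.0, 22.0)
SAVING_FRACTION_RANGE = (0.30, 0.40)
DAYS_PER_YEAR = 365


def annual_saving_kwh(daily_kwh: float, saving_fraction: float, n_freezers: int = 1) -> float:
    """Estimated electricity saved per year by raising the set point."""
    return daily_kwh * saving_fraction * DAYS_PER_YEAR * n_freezers


# Low end: smallest freezer draw with the smallest saving; high end: the opposite.
low = annual_saving_kwh(DAILY_KWH_RANGE[0], SAVING_FRACTION_RANGE[0])
high = annual_saving_kwh(DAILY_KWH_RANGE[1], SAVING_FRACTION_RANGE[1])
print(f"Per freezer: {low:.0f} to {high:.0f} kWh/year")

# Hypothetical scaling: one freezer in each of 53 labs (an assumption).
print(f"53 freezers: {53 * low:.0f} to {53 * high:.0f} kWh/year")
```

On these inputs the per-freezer saving is on the order of a few thousand kWh a year, which is why the set-point change is usually the first action a lab takes.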

Single-use plastics

The wet-lab single-use-plastic flow is enormous and largely necessary (sterility, contamination control, reproducibility). The mitigation in 2026 has two prongs. First, vendor switching: products with ACT Labels in the better tiers (post-consumer recycled content packaging, take-back programmes, reduced primary packaging) materially reduce flows. Second, recycling streams: rigid PE/PP lab plastics (tip boxes, conical tube racks) are recyclable in dedicated lab-plastic streams operated by several vendors. The recycling capture rate has grown substantially since 2022.

The remaining hard problem is contaminated plastics (anything that has touched biological or chemical materials and cannot be cleanly recycled). The mitigation is procurement-stage: smaller tip volumes, single-use serological pipette redesigns with less plastic per unit, reusable glassware where appropriate.

Sustainable HPC

High-performance computing is the fastest-growing emissions source in many research universities. A modern HPC cluster running 24/7 at high utilisation has a substantial Scope 2 footprint; AI-training workloads in particular have caused HPC electricity consumption to grow sharply since 2022.

The mitigations in 2026 include: power-aware job scheduling (running flexible jobs when grid carbon intensity is low, e.g., when wind generation is high); efficiency-first allocation (prioritising jobs that have demonstrated CPU/GPU efficiency); ML-model-efficiency policies (preferring smaller, more efficient models where they suffice); reporting emissions per project in the same way that we report compute hours.
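
Power-aware scheduling can be sketched in a few lines. The example below is a minimal illustration, not any scheduler's actual policy: it assumes an hourly grid carbon-intensity forecast in gCO2/kWh (the numbers are invented) and a flexible job with a fixed duration and constant power draw, and it picks the contiguous window with the lowest total intensity.

```python
from typing import Sequence


def best_start_hour(intensity: Sequence[float], job_hours: int) -> int:
    """Index of the contiguous window of job_hours with the lowest summed intensity."""
    if job_hours <= 0 or job_hours > len(intensity):
        raise ValueError("job_hours must be between 1 and the forecast length")
    window = sum(intensity[:job_hours])
    best_sum, best_start = window, 0
    # Slide the window one hour at a time, updating the running sum.
    for start in range(1, len(intensity) - job_hours + 1):
        window += intensity[start + job_hours - 1] - intensity[start - 1]
        if window < best_sum:
            best_sum, best_start = window, start
    return best_start


def job_emissions_kg(intensity: Sequence[float], start: int, job_hours: int, power_kw: float) -> float:
    """Emissions in kg CO2 for a constant-power job run in the given window."""
    return sum(intensity[start:start + job_hours]) * power_kw / 1000.0


# Invented forecast (gCO2/kWh) for the next eight hours; not real grid data.
forecast = [300, 280, 150, 120, 130, 260, 310, 290]
start = best_start_hour(forecast, job_hours=3)
print("start at hour", start)
print("run now:", job_emissions_kg(forecast, 0, 3, power_kw=10), "kg CO2")
print("deferred:", job_emissions_kg(forecast, start, 3, power_kw=10), "kg CO2")
```

A production scheduler would also have to respect queue priorities and deadlines; the point here is only that the deferral decision reduces to a sliding-window minimum over a forecast.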

The Green Algorithms tool and the CodeCarbon Python library let researchers estimate emissions per analysis. UKRI and the EU’s Horizon Europe programme now ask researchers to report estimated emissions in proposals for compute-heavy projects.

Conference travel: the Scope 3 elephant

Academic conference travel is, for most research-intensive universities, the single largest Scope 3 emissions category. A round-trip transatlantic flight emits roughly 1-2 tonnes of CO2 per passenger; an academic with a typical conference cadence can easily account for 5-10 tonnes/year of travel emissions, which dwarfs everything else they personally consume.

The 2020-2024 pandemic forced a partial shift to virtual conferencing; the post-pandemic settlement has not held. By 2026 conference travel is largely back to pre-pandemic levels, though with somewhat more hybrid options. The frameworks that have emerged include: institutional travel-budget caps with carbon-equivalent accounting; conference-clustering (combining several events into one trip rather than making several separate trips); flight-free regional conferences (the UK Reproducibility Network’s flight-free Easter conference, the European Geosciences Union’s hybrid format); and proportional-attendance models in which junior researchers attend in person while seniors attend virtually.

The conference travel emissions conversation is genuinely difficult because there are real career and equity costs to reducing in-person attendance. The current best practice is to count, declare, and make trade-offs visible, rather than to impose a top-down quota.

Scope 1, 2, 3 in research-org context

For a research institution: Scope 1 is direct (campus heating fuel, owned vehicles); Scope 2 is purchased energy (electricity, district heating); Scope 3 is everything else (travel, procurement, commuting, waste, investments). For a typical research-intensive university, Scope 3 is 70-90% of total emissions, with travel and procurement dominating. The implication is that a serious sustainability programme must address Scope 3 procurement (sustainable lab purchasing) and Scope 3 travel (conferences and fieldwork), not just on-campus operations.

The sustainable-research domain at CASRAI tracks framework adoption and institutional case studies; the research carbon footprint entry walks through the standard accounting methodology adapted for research organisations.

References

Farley et al., LEAF: a tool for laboratory sustainability assessment (UCL technical report, 2019, updated 2023). My Green Lab, 2024 Certification Standard (current version). Urbina et al., Labs should cut plastic waste too (Nature, 2015, the foundational plastics paper). Lannelongue et al., Green Algorithms: Quantifying the carbon footprint of computation (Advanced Science, 2021). UCL Race to Zero, Freezer temperature transition report (2023).
December 4, 2025
Why the next CRediT version should include ‘AI assistance’ as a role
The 14 roles of CRediT were designed in 2013-2014 with a model of contribution that did not include large language models or generative AI systems. A decade on, the taxonomy is robust and widely adopted, but the AI question is hard to ignore. This post makes the case — tentatively, and with attention to the counter-arguments — that the next CRediT revision should add a 15th role explicitly covering AI assistance. We are publishing it here to invite community pushback before any formal proposal goes to the CRediT stewardship group.

Why this question is not solved by disclosure alone

The current consensus around generative AI in scholarly authorship rests on two pillars: AI cannot be a co-author (the ICMJE 2023 position), and AI use must be disclosed in a structured declaration. CASRAI agrees with both. They do not, however, resolve the question of how AI assistance shows up in CRediT.

A worked example. Suppose a paper has four authors. Author A wrote the first draft with substantial assistance from a large language model, which she prompted, edited, fact-checked, and revised. Author B ran the formal analysis using an AI-assisted statistical-discovery tool that proposed model specifications. Author C generated several of the figures using a GenAI visualisation tool. Author D supervised. Each used AI; each used it differently; each took human responsibility for the output. How does the CRediT statement represent this?

Under current CRediT, AI use is invisible. Author A gets Writing – original draft (lead). Author B gets Formal analysis (lead). Author C gets Visualization (lead). Author D gets Supervision. The AI assistance shows up only in the publisher-mandated AI disclosure, which is a free-text field in the methods or acknowledgements. The structured contributorship record has no place for the granular fact that AI was a tool used in discharging each of those roles.

The proposed 15th role

The draft scope we are testing is this:

AI assistance. The use of artificial-intelligence systems, including generative AI, machine-learning models, and automated analytical tools, in the production of the work. Includes prompt engineering, model selection, validation of AI output, and human verification of AI-generated content. Does not include use of AI as a routine tool (e.g., grammar checkers, citation-formatting tools) below a disclosure threshold defined by the publisher.

The role would carry the standard degree-of-contribution qualifier. A human author whose primary contribution was prompting and verifying an AI system would be marked Lead for AI assistance; a co-author who occasionally checked AI outputs would be Supporting. The role would not be a substitute for the existing roles — the human who used AI for the first draft still gets Writing – original draft — but it would add the structured fact that AI was involved.
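
As a rough sketch of what the worked example would look like with the proposed role, the snippet below encodes the four-author statement as structured records. "AI assistance" is the draft 15th role, not part of the published taxonomy; the class name, helper functions, and the degrees assigned to the AI role and to Supervision are illustrative assumptions.

```python
from dataclasses import dataclass

# Proposed role from the draft scope; NOT one of the 14 published CRediT roles.
AI_ROLE = "AI assistance"


@dataclass(frozen=True)
class RoleAssignment:
    contributor: str
    role: str
    degree: str  # CRediT degree of contribution: "Lead", "Equal" or "Supporting"


# The four-author example; degrees for AI_ROLE and Supervision are assumed.
statement = [
    RoleAssignment("Author A", "Writing – original draft", "Lead"),
    RoleAssignment("Author A", AI_ROLE, "Lead"),
    RoleAssignment("Author B", "Formal analysis", "Lead"),
    RoleAssignment("Author B", AI_ROLE, "Supporting"),
    RoleAssignment("Author C", "Visualization", "Lead"),
    RoleAssignment("Author C", AI_ROLE, "Supporting"),
    RoleAssignment("Author D", "Supervision", "Lead"),
]


def contributors_with_role(assignments, role):
    """Sorted names of everyone who holds the given role."""
    return sorted({a.contributor for a in assignments if a.role == role})


def roles_alongside_ai(assignments, contributor):
    """The existing roles a contributor holds in addition to the AI role."""
    return [a.role for a in assignments
            if a.contributor == contributor and a.role != AI_ROLE]


print(contributors_with_role(statement, AI_ROLE))
print(roles_alongside_ai(statement, "Author A"))
```

The existing roles are untouched; the AI role is an additional row per contributor, which is what makes the statement queryable.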

The arguments for

First, structured disclosure is more useful than prose disclosure. A free-text AI declaration cannot be queried, cross-referenced, or aggregated. A CRediT-style structured role can. Integrity offices investigating a fabrication can query for papers with AI assistance roles; funders tracking AI use in grant outputs can roll up the data; bibliometric studies can analyse patterns. None of this is possible with the current free-text disclosure.

Second, granularity matters for accountability. Knowing that a paper used AI is less useful than knowing which contributor used AI for which task. The CRediT role assignment makes the accountability specific. If a fabricated reference appears in the introduction, the question of who is responsible for verifying it has a structured answer.

Third, the boundary is becoming a fiction. Modern statistical workflows include AI components (autoML, AI-assisted exploratory analysis); modern writing workflows include AI components (Copilot for prose, Claude for editing); modern visualisation workflows include AI components. The pretence that these are separable from the role they support is increasingly hard to maintain. If AI is being used to discharge a role, the role assignment should say so.

The arguments against

Three serious counter-arguments deserve engagement.

First, the scope-creep concern. CRediT has held to 14 roles deliberately. Each addition raises the cognitive load on authors filling out the statement, increases the integration burden on publishers, and risks the taxonomy becoming unusable through over-specification. The argument from Liz Allen and the original CRediT designers has been that the taxonomy gains its value from being small enough to use.

Second, the boundary problem. What counts as AI assistance? A grammar checker is plausibly AI; a citation formatter increasingly is; a search engine ranking results by relevance certainly is. If every modern research tool counts as AI, the role becomes meaningless. A workable scope requires a non-trivial threshold (the draft language above gestures at “below a disclosure threshold defined by the publisher”), and that threshold is hard to define without ending up with either everything or nothing.

Third, the disclosure-versus-contribution distinction. CRediT is a contributorship taxonomy. AI is not a contributor — that is the settled position. Adding an AI role to CRediT risks blurring this. The alternative is to keep AI in a separate disclosure form, structurally similar to a competing-interests declaration or a funding statement, rather than in the contributorship statement.

A possible middle path

The middle path is to keep CRediT at 14 roles and to define a parallel AI assistance declaration with comparable structure: a controlled vocabulary of AI-use types, a per-contributor breakdown linked to ORCID iDs, a model-and-version field, and a verification statement. This would sit alongside CRediT in publisher submission systems and JATS XML, rather than inside it.
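
One way to picture such a declaration is the minimal record below. It is a sketch under stated assumptions: the vocabulary terms, field names, and model name are hypothetical placeholders (no such vocabulary has been agreed), and the ORCID iD is the sample identifier used in ORCID's own documentation.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical controlled vocabulary; a real one would need community agreement.
AI_USE_TYPES = {"drafting", "analysis", "visualisation", "editing"}


@dataclass
class AIUseEntry:
    orcid: str               # contributor responsible for this use
    use_type: str            # must come from AI_USE_TYPES
    model: str               # model-and-version fields
    model_version: str
    verified_by_human: bool  # the verification statement, reduced to a flag

    def __post_init__(self):
        # Reject free-text values so the record stays queryable.
        if self.use_type not in AI_USE_TYPES:
            raise ValueError(f"unknown AI-use type: {self.use_type!r}")


def declaration_json(entries):
    """Serialise the per-contributor entries as one JSON document."""
    return json.dumps({"ai_assistance": [asdict(e) for e in entries]},
                      indent=2, sort_keys=True)


entries = [
    AIUseEntry("0000-0002-1825-0097", "drafting", "example-llm", "2025-01", True),
]
print(declaration_json(entries))
```

Rejecting values outside the vocabulary at construction time is the property that separates this from a free-text disclosure.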

This is closer to where the current publisher disclosure forms are heading, and it preserves the conceptual clarity that CRediT roles describe what humans did, while a separate declaration describes what AI tools were used. We are increasingly inclined to recommend this path, with the caveat that the disclosure must be structured to the same standard as CRediT: built on controlled vocabularies rather than free text, deposited to Crossref, and surfaced on ORCID.

What the CRediT stewardship group should do next

Three concrete steps. First, run a structured community consultation through 2026 on whether to add AI assistance as a 15th CRediT role, with the alternative being a parallel structured declaration. The CRediT governance page outlines the consultation process. Second, in parallel, draft the data model for a parallel AI assistance declaration so that the comparison is concrete and not abstract. Third, coordinate with NISO on whether either option requires a revision to Z39.104.

The decision is not urgent in the sense that the integrity system is failing today: the existing disclosure forms work, if badly. It is urgent in the sense that every year of delay produces another year of unstructured AI-use data that cannot be aggregated or analysed, which makes the eventual transition harder.

December 3, 2025
NSPM-33 disclosure: what US researchers must report in 2026
National Security Presidential Memorandum 33 (NSPM-33), signed in January 2021, directed US federal research funding agencies to strengthen and harmonise disclosure requirements for federally funded researchers. Five years later the implementation has stabilised across NIH, NSF, DOE, DOD, NASA, USDA, and the other major science agencies, with the CHIPS and Science Act of 2022 having added enforcement teeth and the 2024 Research Security Programs Standard Requirement having added institutional-level obligations. This post is the practical 2026 compliance map for US-funded researchers.

The shape of NSPM-33 in 2026

NSPM-33’s core mandate is straightforward: a federally-funded researcher must disclose all support they receive (financial, in-kind, or in the form of positions, appointments, or affiliations) so that the funding agency can identify potential conflicts of commitment, undisclosed foreign components, or scientific overlap. The disclosure is made at proposal stage and updated throughout the project’s life.

The five years of implementation have produced two important refinements. First, the common disclosure forms: NIH’s Other Support format, NSF’s Current and Pending (Other) Support, and parallel formats at other agencies have been substantially harmonised under the NSPM-33 implementation guidance. By 2026 a researcher can largely produce one structured disclosure record (typically in SciENcv format) and have it serve all federal agencies. Second, the structured-data submission: the agencies now require disclosure forms in machine-readable format with ORCID linkage, not as free-form PDFs.

What must be disclosed

The 2026 disclosure scope at the major agencies covers, at a minimum:
- All ongoing and pending research support (federal, non-federal, and foreign).
- All in-kind support of significance (laboratory space, equipment access, personnel time).
- All positions and appointments (professorships, visiting positions, advisory roles, board memberships) regardless of whether they are paid.
- All consulting arrangements above a defined threshold (typically a few thousand dollars per year, but agency-specific).
- Foreign government talent recruitment programme participation (see below).
- Patents and patent applications related to the funded research.
- Sponsored or paid travel above defined thresholds.
- For NIH specifically, all support for research effort regardless of how titled.

The Current and Pending Support form (NSF terminology) and the Other Support form (NIH terminology) are the canonical artefacts. They are populated by the researcher at the proposal stage and re-verified at the just-in-time (JIT) request stage if the proposal is funded.

The foreign-component question

The single most consequential 2021-2024 enforcement focus was undisclosed foreign components. A foreign component is any significant scientific element of a project performed outside the United States, whatever the source of funding, including work by foreign collaborators even if it is not separately funded.

NIH’s foreign-component disclosure rule existed before NSPM-33 but was inconsistently enforced. Post-NSPM-33 the enforcement has been substantial: dozens of researchers had grants terminated or returned, and several criminal cases proceeded for fabricated disclosures. The 2023-2024 cohort of cases clarified the threshold: an undisclosed foreign-funded position, a foreign-government talent-recruitment-programme membership, or a substantial unreported collaboration with a foreign laboratory are all material non-disclosures with grant-termination and criminal consequences.

In 2026 the practical rule is conservative: if you have any affiliation, position, support, or significant collaboration outside the US that overlaps in time with your federal-funded project, disclose it. The cost of over-disclosure is filling in more forms; the cost of under-disclosure has become very high.
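
The conservative rule reduces to a date-overlap check, which a researcher or a sponsored-research office could run over a list of affiliations. The sketch below is illustrative only: the record fields and example entries are hypothetical, and it encodes the rule of thumb stated above rather than any agency's formal definition.

```python
from datetime import date
from typing import Optional


def periods_overlap(start_a: date, end_a: Optional[date],
                    start_b: date, end_b: Optional[date]) -> bool:
    """True if the two periods share at least one day; an end of None means ongoing."""
    latest_start = max(start_a, start_b)
    ends = [e for e in (end_a, end_b) if e is not None]
    return not ends or latest_start <= min(ends)


def to_disclose(award_start: date, award_end: date, affiliations: list) -> list:
    """Names of non-US affiliations that overlap the award period."""
    return [a["name"] for a in affiliations
            if a["outside_us"]
            and periods_overlap(award_start, award_end, a["start"], a["end"])]


# Hypothetical affiliations for illustration.
affiliations = [
    {"name": "Visiting professorship", "outside_us": True,
     "start": date(2024, 9, 1), "end": None},
    {"name": "Past exchange visit", "outside_us": True,
     "start": date(2019, 1, 1), "end": date(2019, 6, 30)},
    {"name": "US advisory board", "outside_us": False,
     "start": date(2023, 1, 1), "end": None},
]
print(to_disclose(date(2025, 1, 1), date(2027, 12, 31), affiliations))
```

The check only flags the non-US overlap case described above; the wider disclosure scope (which includes domestic positions and support) still applies.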

Foreign Talent Recruitment Programmes

The Foreign Talent Recruitment Programme (FTRP) category was sharpened by Section 10632 of the CHIPS and Science Act of 2022, which required agencies to prohibit federally-funded researchers from participating in malign FTRPs. The 2024 implementation guidance defined a malign FTRP as one that involves transfer of intellectual property, transfer of laboratory resources, or compensation contingent on outcomes that benefit a foreign government’s national interests, among several other criteria.

The category is narrower than the original 2018-2021 “China Initiative” framing might have suggested. Participation in a non-malign FTRP (a competitive postdoctoral programme, an academic exchange visit, an honorary professorship) is not prohibited but must be disclosed. Participation in a malign FTRP is prohibited for federally-funded researchers and must be terminated as a condition of receiving federal funding.

The institutional-side burden under the 2024 Research Security Programs Standard Requirement is substantial: institutions over a defined funding threshold must implement a research security programme with training, conflict-of-interest screening, foreign-collaboration approval, and ongoing monitoring. The standard requirement specifies the elements; institutions implement them with their own policies.

The reporting workflow in practice

The 2026 workflow for a federally-funded researcher at a US institution typically looks like:
1. SciENcv profile. Maintain a current SciENcv profile with all positions, appointments, and support. SciENcv (Science Experts Network Curriculum Vitae) is the federal-government-supported tool and produces the structured-data formats accepted by NIH, NSF, and other agencies.
2. Proposal-stage disclosure. Export the relevant disclosure form from SciENcv at proposal preparation. Verify with the institution’s sponsored-research office before submission.
3. JIT update. For NIH, re-verify Other Support at JIT request. Any changes since proposal submission must be reported.
4. Award updates. Any new support, position, or appointment acquired during the award must be reported to the agency. NIH’s threshold is “significant changes”; in practice, disclose anything that would have been on the original form.
5. Annual progress reports. RPPR and other annual reporting captures updated Other Support and current-and-pending. Treat this as a real update, not a copy-paste.
6. Final reports and closeout. Disclosure obligations continue through closeout.

Institutional research security programmes

The 2024 Research Security Programs Standard Requirement obligates institutions over the $50M annual federal-research threshold to operate a research security programme covering: cybersecurity training, foreign-collaboration approval workflow, conflict-of-interest and conflict-of-commitment training, export-control compliance, and ongoing monitoring of researchers’ disclosures against external data sources.

The institutional layer matters because most disclosure failures are not fraud; they are inadvertent omission by researchers who did not realise an affiliation was disclosable. A well-functioning research security programme acts as a backstop, with regular reminders, training, and a pre-submission review that catches omissions before they become non-disclosures of consequence.

The CASRAI funder-mandate guide covers the agency-specific disclosure requirements with current links; the research-security domain tracks the cross-agency policy harmonisation.

What’s still uncertain

Three areas remain in active interpretation in 2026. First, the treatment of dual-affiliated researchers: a researcher with a tenured position at a US institution and a part-time appointment at a non-US institution must disclose both, but the threshold for the non-US appointment counting as a foreign component is fuzzy in practice. Second, the scope of the conflict-of-commitment definition: an unpaid advisory role at a foreign institution may not count as support but does count as commitment; the agencies vary in how they treat this. Third, the retroactive application: disclosure failures discovered years after the funded work was completed have been treated with substantial inconsistency, with some cases pursued criminally and others handled administratively.

For researchers, the safe path is conservative disclosure, current SciENcv maintenance, and proactive consultation with the institution’s sponsored-research office whenever an affiliation or support is ambiguous. The compliance cost of asking is low; the cost of under-disclosure that surfaces later is potentially career-ending.

References

NSTC Joint Committee on the Research Environment, Guidance for Implementing National Security Presidential Memorandum 33 (January 2022). NIH Office of Extramural Research, Notice of Information: Updates to Other Support (NOT-OD-22-150 and subsequent updates). CHIPS and Science Act of 2022, Section 10632 (Foreign Talent Recruitment Programs). OSTP, Research Security Programs Standard Requirement (July 2024). NSF, Proposal and Award Policies and Procedures Guide (current version).
November 26, 2025
Crossref’s grant-linking initiative and CRediT: a 2025 status
Crossref’s Grant Linking System (GLS), in development since 2019 and in steady production since 2022, has quietly become one of the most useful bits of plumbing in scholarly metadata. Its 2025 expansion — covering more funders, more grant metadata, and tighter integration with Crossref’s content-registration deposit schema — is worth a closer look, particularly for anyone integrating CRediT contributorship with funding attribution. This post walks through what GLS does, what changed in 2025, and how a CRediT integrator should consume the data.

What GLS is, briefly

A grant, as a thing in the world, has a funder (an organisation that paid), one or more awardees (people and institutions), a title, a project description, an amount, a duration, and a set of outputs (papers, datasets, software, other artefacts) that the grant produced. Pre-GLS, each piece sat somewhere different. The funder lived in the Funder Registry (a Crossref-maintained list of funder organisations). The grant number was a free-text string in publisher metadata. The outputs were registered with DOIs at Crossref but without structured links back to the grant. The result was that the graph from funder through grant to output existed only in fragments.

GLS provides the missing middle layer. A funder registers grants with Crossref via a dedicated deposit schema; each grant gets a DOI; the grant DOI carries structured metadata about the funder, the project, the awardees (with ORCID iDs where available), and the institutions (with ROR IDs). Publishers and other depositors then reference the grant DOI from the output’s metadata, closing the loop.

2025 expansion: more funders, more metadata

The major story of 2025 was funder participation. Through 2024 the GLS depositors were a small set of early adopters (Wellcome, the Templeton Foundation, ANR, a handful of others). 2025 added the major UK research councils (UKRI’s component councils now register grants via GLS), several EU H2020 and Horizon Europe streams (via OpenAIRE-mediated deposit), the Australian Research Council, and — significant for the US ecosystem — initial NSF and NIH pilots. NIH’s pilot is small (a few thousand R01 grants), but it signals direction.

The metadata expansion was equally important. GLS 2025 added structured fields for: project abstract (free text but indexed), discipline classification (using Crossref-curated CODE FOR codes and OECD FoS), expected outputs, ethics-board identifiers, and — the field most relevant to CRediT integrators — a participants structure that names each grant participant with an ORCID iD and a role from a controlled vocabulary (principal investigator, co-investigator, collaborator, named researcher, fellow, named staff). This is not CRediT, but it interlocks with CRediT cleanly.

The CRediT-GLS interlock

Here is the integration pattern that is now possible. A grant is registered with GLS, getting a DOI and a structured participant list. A paper acknowledges the grant by including the grant DOI in its Crossref deposit. The paper’s JATS carries a CRediT contributor statement, which is also deposited to Crossref via the relationships block. ORCID consumes both deposits via the public API and can now answer: this researcher contributed to this paper in these CRediT roles; the paper acknowledges this grant; this researcher is on the grant participant list in this grant role.

The query is structured and unambiguous. Pre-2025, it required string matching grant numbers and best-effort author-name disambiguation. Post-2025, the entire chain runs on PIDs. The CRediT adoption ledger at CASRAI tracks which publishers deposit CRediT to Crossref in the form that makes this work, and which still drop the qualifiers at the deposit step.
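
The PID-only chain can be illustrated with in-memory records. This is a sketch of the join logic, not of any Crossref or ORCID API: the dictionaries stand in for already-fetched metadata, the DOIs are invented placeholders, and the ORCID iDs are sample values.

```python
# Grant records keyed by grant DOI; participants map ORCID iD -> grant role.
grants = {
    "10.00000/grant-1": {
        "participants": {"0000-0002-1825-0097": "principal investigator"},
    },
}

# Paper records keyed by paper DOI; credit maps ORCID iD -> CRediT roles.
papers = {
    "10.00000/paper-1": {
        "grant_dois": ["10.00000/grant-1"],
        "credit": {
            "0000-0002-1825-0097": ["Formal analysis", "Writing – original draft"],
            "0000-0001-5109-3700": ["Visualization"],
        },
    },
}


def contribution_chain(orcid, papers, grants):
    """Rows of (paper DOI, CRediT roles, grant DOI, grant role or None) for one researcher."""
    rows = []
    for paper_doi, paper in papers.items():
        roles = paper["credit"].get(orcid)
        if not roles:
            continue
        for grant_doi in paper["grant_dois"]:
            grant = grants.get(grant_doi)
            # None when the grant is unknown or the person is not a named participant.
            grant_role = grant["participants"].get(orcid) if grant else None
            rows.append((paper_doi, tuple(roles), grant_doi, grant_role))
    return rows


for row in contribution_chain("0000-0002-1825-0097", papers, grants):
    print(row)
```

Every join key is an identifier (DOI or ORCID iD), so no name matching or grant-number parsing is involved. A contributor who is not on the participant list comes back with a grant role of `None`.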

What integrators need to do

For publishers depositing content with Crossref, the 2025 GLS recommendations are: include grant DOIs in the funding section of every deposit where a grant is acknowledged; resolve and validate the grant DOI before deposit; carry CRediT roles with the degree-of-contribution qualifier in the contributor section. Crossref’s submission schema 5.4 supports all of this; older schema versions do not, and a number of publishers are still on 4.x.

For institutional CRIS systems, the recommendation is to ingest grant DOIs into the funding record alongside the internal grant number, and to use the grant DOI as the join key when reconciling CRIS funding records against ORCID’s funding entries and Crossref’s content metadata. The CASRAI CRIS integration guide has been updated with the GLS ingestion patterns by major CRIS vendor.

For funders not yet depositing to GLS, the question is what to do about historical grants. Crossref’s recommendation is to deposit prospectively (new awards from a chosen start date) and backfill historical grants over time. The funders that have done well at GLS uptake budgeted a small data-engineering effort over 6-12 months to backfill 5-10 years of historical grants from internal records.

The RAiD-Crossref-GLS triangle

An open question for 2026 is the relationship between GLS and RAiD. Both can identify a research project; both can carry participant, institution, funding, and output metadata; both have ISO-standard or de-facto-standard status. The honest answer is that they overlap meaningfully but serve different communities and emphases.

GLS is funder-centric: a grant is the unit; the funder registers it; outputs reference it. RAiD is project-centric: a project can span multiple grants, multiple funders, multiple institutions, with the project itself the persistent unit. For a single-funder, single-project grant they are functionally identical. For a multi-funder collaboration (typical in clinical trials, large astronomy or particle physics projects, EU consortia), RAiD captures the project shape; the individual grants funding it would each be GLS-registered and the RAiD would reference them.
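
One way to picture the difference in shape is a project record that references several grant records. The classes below are an illustrative data model only; the names `Project` and `GrantRecord` and their fields are hypothetical and do not reflect the RAiD or GLS metadata schemas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GrantRecord:
    doi: str      # grant DOI, as registered by the funder
    funder: str   # funder identifier or name (placeholder)


@dataclass(frozen=True)
class Project:
    raid: str                         # project-level identifier
    grants: tuple[GrantRecord, ...]   # grants that fund the project

    def funders(self):
        """Distinct funders behind the project, in first-seen order."""
        return list(dict.fromkeys(g.funder for g in self.grants))

    def is_multi_funder(self):
        return len(self.funders()) > 1
```

For a single-funder, single-grant project the `Project` wrapper adds nothing beyond the grant itself, which is the sense in which the two identifiers are functionally identical in that case.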

The Crossref-DataCite-ARDC working group has begun work on a formal crosswalk that lets a GLS grant DOI declare a RAiD that it contributes to, and vice versa. This will not collapse the two but will let consumers traverse the graph in either direction.

Consuming the data

The Crossref REST API exposes GLS grants under /works with a type filter of grant; the relationship from a paper to its grant is in the paper’s relation block with relationship type is-funded-by. For ORCID-aware consumers, the ORCID 4.0 funding resource now carries the grant DOI as a primary identifier, with the Funder Registry entry as the funder.
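
A minimal sketch of the consumer side follows. It only builds the request URL and parses an already-fetched work record, so it runs without network access. The `filter=type:grant` query and the optional `mailto` parameter follow the public Crossref REST API. The `is-funded-by` key and the `id` / `id-type` item shape inside `relation` are assumptions taken from the description above, and the helper names are hypothetical.

```python
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"


def grants_query_url(rows=20, mailto=None):
    """Build a /works query restricted to records of type grant."""
    params = {"filter": "type:grant", "rows": rows}
    if mailto:
        params["mailto"] = mailto  # contact address for polite use of the API
    return f"{CROSSREF_WORKS}?{urlencode(params)}"


def funding_grant_dois(work):
    """Extract grant DOIs from a work's relation block (assumed shape)."""
    relation = work.get("relation", {})
    return [
        item["id"]
        for item in relation.get("is-funded-by", [])
        if item.get("id-type") == "doi"
    ]
```

Returning an empty list when the relation block is absent keeps callers from having to special-case papers that acknowledge no registered grant.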

OpenAIRE consumes GLS deposits and exposes the resulting graph in its OpenAIRE Graph, which is the easiest single endpoint to query for the full funder-grant-output structure. For institutional consumers without the bandwidth to consume Crossref directly, OpenAIRE Graph is the recommended starting point.

What’s still rough

Two known limitations. First, the GLS participants structure does not yet carry CRediT roles directly; it carries grant participation roles, which are a coarser categorisation. This is by design — grant participation is not authorship — but it means that the question “which CRediT role did this person play on this grant” can only be answered indirectly, by intersecting grant participation with CRediT roles on the grant’s outputs. We expect this to be cleaned up in a future GLS schema revision.

Second, historical-coverage gaps remain. Pre-2020 grants are almost entirely absent from GLS; 2020-2023 coverage is partial; 2024 onward is increasingly complete. Tools building on the GLS graph need to handle the missing-grant case gracefully.
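
Handling the missing-grant case gracefully could be as simple as returning a status instead of raising, together with a hint about how surprising the gap is. The sketch below encodes the coverage bands stated above. The function names, the status strings, and the dictionary-based `grant_index` are hypothetical.

```python
def expected_coverage(award_year):
    """Map an award year to the coverage band described in the text."""
    if award_year < 2020:
        return "mostly-absent"
    if award_year <= 2023:
        return "partial"
    return "increasingly-complete"


def resolve_grant(grant_index, grant_doi, award_year):
    """Look up a grant without treating a miss as an error."""
    record = grant_index.get(grant_doi)
    if record is not None:
        return {"status": "found", "grant": record}
    return {
        "status": "missing",
        "grant": None,
        # A miss on an older award is expected; on a recent one it is worth flagging.
        "expected_coverage": expected_coverage(award_year),
    }
```

A downstream report can then distinguish "no grant because coverage is thin" from "no grant where one should exist", rather than silently dropping the output.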

November 19, 2025
Indigenous data governance: CARE Principles in practice
The CARE Principles for Indigenous Data Governance were published in 2019 by the Global Indigenous Data Alliance (GIDA), expressing four principles – Collective benefit, Authority to control, Responsibility, Ethics – designed to sit alongside the FAIR principles when research data involves Indigenous Peoples, communities, lands, or knowledge. This post offers an introductory map of the CARE landscape in 2026, the relationships among the regional Indigenous data sovereignty movements that informed it, and the operational artefacts that researchers and institutions are using to apply CARE in practice. We write as outsiders to these traditions and rely on the published statements of Indigenous-led organisations; what follows is descriptive, not prescriptive, and any institution implementing CARE should engage directly with the communities whose data is in question.

The CARE Principles

The CARE Principles, drafted by Stephanie Russo Carroll, Maui Hudson, Tahu Kukutai, and colleagues through the GIDA, articulate that data governance is not only a question of technical FAIR-ness but of who has authority over data, who benefits, and what ethical commitments are owed. The four pillars are:
- Collective benefit. Data ecosystems should be designed and function in ways that enable Indigenous Peoples to derive benefit from the data. Inclusive development and innovation; improved governance and citizen engagement; equitable outcomes.
- Authority to control. Indigenous Peoples’ rights and interests in Indigenous data must be recognised and their authority to control such data must be empowered. Recognising rights and interests; data for governance; governance of data.
- Responsibility. Those working with Indigenous data have a responsibility to share how those data are used to support Indigenous Peoples’ self-determination and collective benefit. Positive relationships; expanding capability and capacity; Indigenous languages and worldviews.
- Ethics. Indigenous Peoples’ rights and wellbeing should be the primary concern at all stages of the data life cycle. Minimising harm and maximising benefit; justice; future use.
The principles are deliberately at the level of governance commitments, not the level of technical implementation. Their operationalisation depends on engagement with specific communities and their own governance institutions.

The regional movements that preceded CARE

CARE did not emerge in isolation. It is the international synthesis of regional Indigenous data-sovereignty movements that had been building governance frameworks for years.

OCAP® Principles (Canada)

The OCAP® Principles – Ownership, Control, Access, Possession – were first articulated in 1998 and are stewarded by the First Nations Information Governance Centre (FNIGC); they have governed First Nations data in Canada since. OCAP® is a registered trademark of FNIGC; the principles assert that First Nations have collective ownership of their information, control over how it is collected and used, access to it, and physical possession of it. FNIGC operates training programmes that researchers working with First Nations data are expected to complete; multiple Canadian Tri-Agency Indigenous research policies reference OCAP® explicitly.

Te Mana Raraunga (Aotearoa New Zealand)

Te Mana Raraunga is the Māori Data Sovereignty Network, established 2015. Te Mana Raraunga articulates Māori data sovereignty rooted in tino rangatiratanga (self-determination) under Te Tiriti o Waitangi. The Network’s foundational statements include the 2018 Principles of Māori Data Sovereignty, which were among the documents informing CARE. The relationship between Te Mana Raraunga’s Māori-specific frame and the international CARE frame is one of mutual recognition; Te Mana Raraunga operates with the authority of Māori governance, not as an instance of an international standard.

Maiam nayri Wingara (Australia)

Maiam nayri Wingara, the Aboriginal and Torres Strait Islander Data Sovereignty Collective, was established in 2017 and articulated principles of Indigenous data sovereignty for Australia in 2018. The collective’s work emphasises the rights of Aboriginal and Torres Strait Islander peoples to control data about their people, communities, lands, and waters. The Australian Indigenous Health-Welfare Data Working Group and several federal agencies’ Indigenous data policies reference Maiam nayri Wingara’s frame.

Other regional movements

Indigenous data sovereignty movements with their own governance frameworks operate in many other contexts, including Sámi Council work in Sápmi, Native American data sovereignty organising in the United States (the United South and Eastern Tribes Tribal Health Program, among others), and Indigenous Latin American collectives. The CARE Principles refer to and respect this plurality; they are not a substitute for any of these regional frameworks but a complement at international scale.

How CARE relates to FAIR

CARE and FAIR are designed to coexist. FAIR addresses technical interoperability and data reusability; CARE addresses governance authority and ethical commitments. A dataset can be both FAIR and CARE-compliant; a dataset can also be FAIR while failing CARE (technically open data that violates community authority); a dataset can be CARE-compliant while not openly FAIR (community-controlled data with restricted access in line with community decision).

The GIDA’s published positioning is that CARE precedes FAIR when Indigenous data is involved: the questions of authority, benefit, responsibility, and ethics must be settled before the questions of findability, accessibility, interoperability, and reusability are operationalised. A FAIR-without-CARE approach to Indigenous data has historically reproduced harm; CARE asks researchers and institutions to do the governance work first.

Free, Prior, and Informed Consent

Free, Prior, and Informed Consent (FPIC) is the international human-rights principle, articulated in the UN Declaration on the Rights of Indigenous Peoples (UNDRIP, 2007) and widely adopted, that Indigenous Peoples must be consulted, and their consent obtained, before any project affecting them, their lands, or their resources proceeds. FPIC applies to research projects involving Indigenous communities, knowledge, or data. The four elements – free (without coercion), prior (sufficiently in advance), informed (with adequate information), consent (with a community decision-making process) – are all substantive.

FPIC operationalisation depends on the community in question. Some communities have formal protocols and Indigenous Research Ethics committees; others negotiate consent through community-leader engagement; others may decline participation. In all cases the timing of the consent process matters: FPIC sought after a project has been designed is generally not FPIC; FPIC must precede project design or at minimum precede any irreversible step.

Traditional Knowledge Labels and Local Contexts

Traditional Knowledge (TK) Labels and Biocultural (BC) Labels, developed by the Local Contexts initiative led by Jane Anderson and Kim Christen, are metadata labels that can be attached to datasets, archival records, or collection items to communicate community-defined permissions, attribution requirements, and cultural protocols. TK Labels include labels for attribution, non-commercial use, outreach, family or clan use, ceremonial use, and others; BC Labels cover biocultural specimens and data with similar granularity.

The labels are not legal instruments by themselves; they are governance signals issued by communities that researchers and institutions are expected to respect. Several repositories (notably the Mukurtu CMS platform, also developed by Christen and colleagues) integrate TK and BC Labels natively. By 2026 several major museums, archives, and a small but growing number of institutional research repositories support TK Labels at the record level.

Practical implementation for institutions

An institution beginning to operationalise CARE alongside its FAIR practice would, in the broadest terms, attend to:
1. Recognising the priority of community authority over data concerning Indigenous peoples, lands, and knowledge, and reflecting this in institutional research-data policy.
2. Engaging with communities through their own governance institutions early, with FPIC understood as a substantive process not a checkbox.
3. Adopting the relevant regional principles where applicable (OCAP in Canada, Te Mana Raraunga principles in Aotearoa, Maiam nayri Wingara in Australia, etc.) rather than treating CARE as a substitute.
4. Supporting researchers in their institution with training, ethics-board capacity, and community-engagement resources; not pushing the burden onto Indigenous researchers within the institution.
5. Implementing technical support for community-defined permissions (TK Labels, access-control models that respect community decision) in institutional repositories.
6. Reporting transparently to communities about how data is used, with channels for community-initiated change to data status.
Several institutional CRIS and repository vendors have begun adding CARE-aware functionality (TK Label support, community-attribution fields, access-control models that respect community-defined permissions). The CASRAI Indigenous data CARE domain tracks adoption.

The integrity question

The honest position for non-Indigenous researchers and institutions is that operationalising CARE well requires deferring to Indigenous-led governance, not designing one’s own “CARE-compliant” system. The literature is consistent on this point: the CARE Principles were developed by Indigenous-led organisations and their authoritative interpretation rests with those organisations and the communities they serve. The CARE Principles are not a checklist against which an external institution can score itself and then self-certify.

The implication for institutions and researchers is that the CARE work is relational and ongoing rather than one-time and administrative. The investment is in long-term partnerships with communities, capacity-building within Indigenous research leadership, and a willingness to share authority over how data flows into and out of institutional systems. The technical artefacts (TK Labels, FPIC processes, Mukurtu integrations) support the relational work; they do not substitute for it.

Where to learn more

For non-Indigenous researchers and institutions beginning this work, the starting point is the GIDA’s published statement of the CARE Principles, read alongside the regional movements’ own foundational documents (FNIGC on OCAP, Te Mana Raraunga on Māori data sovereignty, Maiam nayri Wingara on Aboriginal and Torres Strait Islander data sovereignty). The Carroll, Hudson, Kukutai, et al. 2020 paper in Data Science Journal is the core scholarly reference for CARE. The Local Contexts initiative’s documentation is the primary reference for TK and BC Labels, and the Mukurtu CMS documentation is the main technical reference for community-controlled repository implementation.

References

- Carroll, Hudson, Kukutai, et al., The CARE Principles for Indigenous Data Governance (Data Science Journal, 2020).
- GIDA, CARE Principles for Indigenous Data Governance (founding statement, 2019).
- First Nations Information Governance Centre, The First Nations Principles of OCAP® (foundational and ongoing publications).
- Te Mana Raraunga, Principles of Māori Data Sovereignty (2018).
- Maiam nayri Wingara, Indigenous Data Sovereignty Communique (2018).
- UN Declaration on the Rights of Indigenous Peoples (2007).
- Anderson and Christen, work on Traditional Knowledge Labels and the Local Contexts initiative (ongoing).
November 12, 2025
ORCID 4.0: the IDR roadmap and what it means for CASRAI integrations
ORCID’s Integration and Data Roadmap (IDR) work, which culminated in late 2025 with the 4.0 release of the public and member APIs, is the most consequential PID infrastructure change of the year for anyone who cares about the contributor-affiliation-funding crosswalk. The headline is technical: a new contributions resource that supersedes the old works and employment pairing for representing what a researcher did, where, on whose money, and with whom. The implications reach into nearly every persistent-identifier integration CASRAI tracks.

What 4.0 actually changes

The pre-4.0 ORCID record was a federation of resource types: works (with DOIs), employment (with ROR organisation IDs), education, funding (with grant IDs and Funder Registry entries), peer reviews, and the like. Each was useful in isolation. None of them carried the relations between them in a structured form. If a researcher’s ORCID record listed a paper, an employment at the institution that hosted the work, and a grant that funded the work, those three facts sat in separate resources with no machine-readable link.

4.0 introduces a top-level contribution entity that binds these. A contribution carries: a primary artefact (DOI, software identifier, dataset identifier, or RAiD), a set of CRediT roles with the degree-of-contribution qualifier, an affiliation in force at the time of the contribution (with ROR), funding in force at the time (with Funder Registry or ROR for the funder, plus the grant identifier and ideally a RAiD), and a temporal span. The relationships are explicit and queryable. A consuming system can ask: what did this researcher contribute, at this affiliation, under this grant, on this date? — and get an answer without inference.
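
To make the shape of the entity concrete, here is one possible in-memory representation. It is a sketch of the fields listed above, not the ORCID API schema: the class and attribute names are hypothetical. The `degree` values in the comment ("lead", "equal", "supporting") are the standard CRediT degree-of-contribution terms.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CreditRole:
    role: str                      # e.g. "Conceptualization"
    degree: Optional[str] = None   # "lead", "equal", "supporting"; None = no qualifier


@dataclass
class Contribution:
    primary_artefact: str          # DOI, software ID, dataset ID, or RAiD
    roles: list[CreditRole]
    affiliation_ror: str           # affiliation in force at the time
    funder_id: str                 # Funder Registry ID or ROR for the funder
    grant_id: str
    start: date
    end: Optional[date] = None     # open-ended if None
    raid: Optional[str] = None

    def active_on(self, day):
        """True if the contribution's temporal span covers the given date."""
        return self.start <= day and (self.end is None or day <= self.end)
```

With the span stored on the entity, the "on this date" part of the question becomes a simple filter on `active_on` rather than something inferred from publication dates.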

The CRediT-at-record-level integration matures

The 2024 work to allow CRediT roles to live on an ORCID record (not just in publisher JATS) was the precursor to 4.0. The integration shipped, was widely adopted, and exposed two limitations that 4.0 closes. First, role assignments lived inside the work resource, making it awkward to express a Conceptualization role spanning several papers and datasets. Second, the qualifier was carried only at per-work granularity. 4.0 lets a CRediT role attach to a contribution that groups multiple artefacts, with the qualifier travelling with the contribution.

Practical example: a researcher who is Lead for Conceptualization across a clinical trial’s primary paper, protocol paper, registered data, and statistical analysis plan should be representable that way. Pre-4.0, the assertion had to be repeated four times; post-4.0, it lives on the contribution entity. See the ORCID implementation guide for the API patterns.
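
The difference between the two representations can be illustrated by collapsing repeated per-work assertions into a single grouped entry. The tuple layout and the output dictionary keys below are hypothetical and chosen for illustration only; they are not the ORCID deposit format.

```python
from collections import defaultdict


def group_assertions(assertions):
    """Collapse (artefact, role, degree) triples into one entry per (role, degree)."""
    grouped = defaultdict(list)
    for artefact, role, degree in assertions:
        grouped[(role, degree)].append(artefact)
    return [
        {"role": role, "degree": degree, "artefacts": artefacts}
        for (role, degree), artefacts in grouped.items()
    ]
```

Fed the four artefacts from the clinical-trial example, each tagged Conceptualization with degree lead, this yields a single entry listing all four artefacts instead of four separate assertions.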

RAiD becomes a first-class citizen

One of the unsung wins in 4.0 is the elevation of RAiD to a first-class identifier alongside DOI. Pre-4.0, RAiD could be carried in an ORCID funding resource as an external identifier, but the schema treated it as a second-tier metadata field. 4.0 adds RAiD to the primary identifier set for both contributions and funding, with the same validation and resolution support as DOI.

This matters because RAiD is increasingly the canonical project-level identifier, and ORCID is increasingly the canonical person-level record. The interlock — researcher X contributed to project RAiD Y, which produced papers A, B, C — is now a structured query rather than a string-match exercise.

Affiliation history with PIDs at both ends

The 4.0 employment and affiliation model has been quietly tightened. Every affiliation now requires a ROR organisational ID at registration; legacy string-only affiliations are preserved but flagged. The optional department field accepts a ROR sub-organisation ID where one exists (the ROR hierarchy work has caught up to make this practical), or a free-text department name as a fallback. The result is that affiliation history on an ORCID record is now reliably machine-readable at the ROR ID level.
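
A consumer that wants to separate structured affiliations from legacy string-only ones could start with a purely syntactic check. The regular expression below follows the ROR identifier format as publicly documented (a leading 0, six base32 characters, two digits). The affiliation dictionary shape and the three labels returned are assumptions for illustration.

```python
import re

# ROR ID: leading "0", six Crockford-base32 characters, two checksum digits.
ROR_PATTERN = re.compile(r"^(?:https://ror\.org/)?0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}$")


def classify_affiliation(affiliation):
    """Label an affiliation as structured, legacy string-only, or malformed."""
    ror = affiliation.get("ror")
    if not ror:
        return "legacy-string-only"
    if ROR_PATTERN.match(ror.strip().lower()):
        return "structured"
    return "invalid-ror"
```

This checks format only. It does not verify the checksum or confirm that the ID resolves to a live record, so it complements rather than replaces a lookup against the ROR API.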

For institutions running a CRIS, this closes a longstanding crosswalk gap. CRIS-to-ORCID deposit can now write structured affiliations that ORCID-to-CRIS retrieval can read back without ambiguity. The CASRAI CRIS integration guide has been updated with the 4.0 deposit patterns.

What CASRAI integrations need to do

Three things, in priority order.
1. Update CRediT JATS round-trips. Publishers depositing structured CRediT to ORCID via the member API should switch to the contribution resource for new deposits. Legacy works-with-roles deposits will continue to be accepted through 2026 but will be migrated server-side in 2027. The CASRAI CRediT JATS integration patterns now include both the legacy and the 4.0 deposit forms; new integrators should implement only the 4.0 form.
2. Validate ROR IDs at affiliation deposit. A CRIS or publisher pushing affiliation data to ORCID should resolve and validate the ROR ID before deposit. The 4.0 API will reject obviously bad ROR IDs at the schema layer but will accept ROR IDs that resolve to deprecated or merged records. A pre-deposit validation pass against the ROR public API catches the common error cases.
3. Test the funding-to-contribution link. If your integration writes funding entries, link them explicitly to the contributions they funded via the new funded_by relation on the contribution resource. This is the integration point that was missing pre-4.0 and that downstream consumers (funder dashboards, institutional reporting) most want to query.
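
For the second item, the pre-deposit pass might inspect the record returned by the ROR API before writing the affiliation. The sketch below works on an already-fetched record, so it runs offline. It assumes the record carries a `status` field and a `relationships` list whose items have `type` and `id`, as in the public ROR schema; the function name and the three outcome labels are hypothetical.

```python
def check_ror_record(record):
    """Decide whether a ROR record is safe to deposit as an affiliation.

    Returns (decision, replacement_id), where decision is one of
    "ok", "replace", or "reject".
    """
    status = record.get("status", "").lower()
    if status == "active":
        return ("ok", None)
    # Inactive or withdrawn: prefer a successor record if ROR lists one.
    successors = [
        rel["id"]
        for rel in record.get("relationships", [])
        if rel.get("type", "").lower() == "successor"
    ]
    if successors:
        return ("replace", successors[0])
    return ("reject", None)
```

Taking the first successor is a simplification. A merged or split organisation can list several, in which case a human decision is probably needed.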

Backwards compatibility and the migration window

ORCID’s commitment is that the 3.x APIs remain available through the end of 2027, with the 4.0 API as the recommended target from now on. The data model migration is largely automatic for existing records: pre-existing works with associated employment and funding are projected into the contribution model server-side. Consumers reading via the 4.0 API will see contribution entities even for data that was deposited in the 3.x form.

The one wrinkle is CRediT role assignments that were deposited in 3.x without explicit qualifiers. These project into the contribution model with no qualifier set — a valid state, but less informative than it could be. Publishers should re-deposit historical CRediT data with qualifiers where they have them during 2026.

What this enables downstream

The most interesting consequence of 4.0 is the ability to ask compound questions across the PID graph. Which researchers, affiliated with which institutions, contributed in which CRediT roles, to outputs funded by which funders, on which projects? — that query reduces to a structured traversal across ORCID, ROR, Crossref/DataCite, and the Funder Registry, with RAiD optionally tying the project layer together. The OpenAIRE Graph already operationalises a version of this; 4.0 makes it cleaner.
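
Once contributions are available as structured records, the compound question reduces to a group-and-count. The sketch below assumes each contribution has already been flattened into a dictionary with `affiliation_ror`, `funder_id`, and `credit_roles` keys. That shape is hypothetical, and in practice the data would come from several services rather than one list.

```python
from collections import Counter


def roles_by_institution_and_funder(contributions):
    """Count contributions per (institution ROR, CRediT role, funder)."""
    counts = Counter()
    for contribution in contributions:
        for role in contribution["credit_roles"]:
            key = (contribution["affiliation_ror"], role, contribution["funder_id"])
            counts[key] += 1
    return counts
```

A funder dashboard could read this as "how many Software contributions did we fund at institution X", with no name disambiguation step anywhere in the path.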

For institutions, the practical implication is that reporting against funder mandates becomes substantially less manual. For publishers, the JATS-to-ORCID deposit becomes more valuable because it now persists in a queryable graph. For funders, the funder-PID-to-output traceability that ORCID has long promised starts to deliver at scale.

What’s still missing

4.0 does not solve everything. The contributor-affiliation-funding triple is now structured; the contributor-contributor relationship (collaboration graphs, mentorship) is not. A relationships resource is in development but not in 4.0. CARE-aligned identifiers for Indigenous researchers are also still in design.

CASRAI’s integration tracking will follow 4.0 through 2026. The persistent-identifiers domain is being updated to reflect the contribution model; the ORCID federation page tracks member implementation.

November 5, 2025