Tag: funding finance

  • Funding acknowledgements and grant identifiers: closing the loop on research funding

    Almost every research paper carries a sentence of thanks to its funders — “this work was supported by” followed by an agency name and, if you are lucky, a grant number. It takes seconds to write and, in its usual free-text form, it is almost useless as data. A funder trying to answer the simple question “what did our money produce?” finds the answer scattered across thousands of inconsistently worded acknowledgements that no system can reliably aggregate. Closing that loop — connecting the grant that paid for the work to the outputs the work produced — is the problem this article is about, and it belongs to the funding-and-finance domain. For authors, the practical starting point is the guidance on acknowledging funders.

    The free-text problem

    The same funder is written a hundred ways across a corpus: full legal name on one paper, an acronym on the next, a translated form, a former name, a sub-programme mistaken for the parent body, a typo in the grant number. To a human reader the meaning is obvious; to a system trying to count outputs per funder or per grant, every variant is a different string that fails to match. The consequences are concrete. Funders cannot easily demonstrate return on investment, evaluate which schemes produced the most influential work, or check that the open-access and reporting conditions attached to a grant were met. Institutions cannot reconcile what was acknowledged against what was awarded. The information exists on the page; it simply is not in a form anyone can compute with.

    The Open Funder Registry: identifying the funder

    The first half of the fix is to give every funder a single, stable identifier. The Open Funder Registry (originally FundRef, now maintained as part of Crossref’s infrastructure) is an open, curated list of funding bodies, each with a unique funder ID and a controlled record of its name variants, acronyms, and hierarchical relationships to parent and child organisations. When a publisher records a funder ID against an acknowledgement rather than only a free-text name, every spelling of that funder’s name, whether the full legal name, an acronym, or a translated form, collapses onto one entity. The registry does for funders what other registries do for institutions: it replaces a messy display string with a resolvable identifier that carries the meaning.
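
    A minimal sketch of that collapse, assuming a tiny in-memory registry. The funder IDs, names, and variants below are invented placeholders rather than real Open Funder Registry records, and `resolve_funder` is a hypothetical helper, not part of any Crossref API:

```python
import re

# Hypothetical registry extract: IDs and names are invented for illustration.
REGISTRY = {
    "funder:0001": {
        "preferred": "Example Research Council",
        "variants": [
            "Example Research Council",
            "ERC-X",
            "Conseil Exemple de Recherche",
        ],
    },
    "funder:0002": {
        "preferred": "Sample Health Foundation",
        "variants": ["Sample Health Foundation", "SHF"],
    },
}


def normalise(name):
    """Case-fold, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", name.casefold()).split())


def build_lookup(registry):
    """Map every normalised name variant to its funder ID."""
    lookup = {}
    for funder_id, record in registry.items():
        for variant in record["variants"]:
            lookup[normalise(variant)] = funder_id
    return lookup


def resolve_funder(name, lookup):
    """Return the funder ID for a free-text name, or None if it is unknown."""
    return lookup.get(normalise(name))


LOOKUP = build_lookup(REGISTRY)
```

    With this lookup, "ERC-X", "erc x" and "Example Research Council." all resolve to the same ID, while a name the registry has never seen returns None instead of a guessed match.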

    The registry also cross-walks to organisation identifiers — many funders are also research organisations, and the alignment between funder IDs and the Research Organization Registry (ROR) lets a body be recognised consistently whether it is being named as a funder or as an affiliation. That cross-walk matters because it stops the funder-data silo and the organisation-data silo from telling different stories about the same institution.

    Crossref grant linking: identifying the grant

    Naming the funder is only half the loop. The more valuable connection is to the specific grant, and that is what Crossref grant linking provides. Crossref operates a grant-registration system in which funders register their awards and receive a grant identifier — a DOI for the grant itself. The grant record carries structured metadata: the funder, the award number, the title, the investigators (ideally with their ORCID iDs), the funded institutions (ideally with ROR IDs), and the award amount and period.

    Once a grant has its own persistent identifier, an output can cite it the way it cites anything else. A published article’s metadata can carry the grant DOI, creating a machine-readable link from the paper back to the award that paid for it. That single link is what closes the loop: instead of inferring the connection from a fragile name-and-number string, a system can follow an identifier from grant to output and back. Funders can then assemble, automatically, the full set of outputs associated with an award — papers, datasets, software, preprints — rather than reconstructing it by hand from acknowledgements.
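
    The step from grant to outputs can be sketched as a simple index over output metadata. The record layout (`doi`, `type`, `grant_dois`) and every DOI here are placeholder assumptions for illustration, not a Crossref schema:

```python
from collections import defaultdict

# Hypothetical output records; all DOIs are placeholders.
OUTPUTS = [
    {"doi": "10.5555/article-1", "type": "article",
     "grant_dois": ["10.5555/grant-A"]},
    {"doi": "10.5555/dataset-1", "type": "dataset",
     "grant_dois": ["10.5555/grant-A", "10.5555/grant-B"]},
    {"doi": "10.5555/preprint-1", "type": "preprint",
     "grant_dois": []},
]


def outputs_by_grant(records):
    """Group output DOIs under each grant DOI they cite."""
    index = defaultdict(list)
    for record in records:
        for grant_doi in record["grant_dois"]:
            index[grant_doi].append(record["doi"])
    return dict(index)


INDEX = outputs_by_grant(OUTPUTS)
```

    An output that carries no grant DOI, like the preprint above, simply does not appear in the index, which is the free-text problem restated: what is not linked cannot be counted.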

    How the pieces fit with the wider identifier stack

    Grant linking is most powerful in combination with the other persistent identifiers, because each answers a different question about a funded piece of work:

    • The funder ID answers who paid — the funding body.
    • The grant ID answers under which award — the specific grant.
    • ORCID answers who did the work — the funded researchers.
    • ROR answers where — the institutions that held the award.
    • The output’s DOI answers what was produced.

    Linked together, these turn a pile of disconnected records into a navigable funding graph: this funder, through this grant, supported these researchers at these institutions to produce these outputs. The graph is what makes funder reporting tractable, and its absence is exactly why “outputs by grant” has historically been so painful to compute.
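
    One way to picture the funding graph is as a list of typed edges between identifiers. This is a sketch under stated assumptions: the relation names (`awarded`, `supports`, `held_at`, `produced`) and all identifiers are hypothetical, chosen only to mirror the five questions above:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str    # identifier of the node the edge starts from
    relation: str  # hypothetical relation label
    target: str    # identifier of the node the edge points to


# Placeholder identifiers standing in for funder ID, grant ID, ORCID, ROR, DOI.
EDGES = [
    Edge("funder:F1", "awarded", "grant:G1"),
    Edge("grant:G1", "supports", "orcid:R1"),
    Edge("grant:G1", "held_at", "ror:I1"),
    Edge("grant:G1", "produced", "doi:O1"),
    Edge("grant:G1", "produced", "doi:O2"),
    Edge("funder:F1", "awarded", "grant:G2"),
    Edge("grant:G2", "produced", "doi:O3"),
]


def targets(edges, source, relation):
    """Follow one relation type outward from a node."""
    return [e.target for e in edges
            if e.source == source and e.relation == relation]


def outputs_for_funder(edges, funder):
    """Walk funder -> grants -> outputs."""
    results = []
    for grant in targets(edges, funder, "awarded"):
        results.extend(targets(edges, grant, "produced"))
    return results
```

    Here "outputs by funder" is a two-hop walk rather than a text-matching exercise, and `targets(EDGES, "grant:G1", "supports")` answers "who did the work" from the same structure.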

    What authors and institutions can do

    1. Record the funder ID and the grant identifier, not just the name. When a submission system offers to attach a registered funder from the Open Funder Registry, or to record a grant DOI, accept it — that is the step that makes the acknowledgement countable.
    2. Quote the award number exactly as the funder issued it, so that even where a grant DOI is not yet available, the number can be matched reliably.
    3. Attach ORCID iDs and ROR IDs to investigators and institutions in grant and output metadata, so the funding graph connects cleanly at every node.
    4. Treat the acknowledgement as structured data, not prose. A sentence of thanks is a courtesy; the identifiers behind it are what let a funder see what its money produced.
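
    The checklist above can be sketched as a completeness check on a funding record. The field names and the example values are assumptions made for illustration; a real submission system will have its own schema:

```python
def identifier_gaps(record):
    """List the identifiers a funding record is still missing."""
    gaps = []
    if not record.get("funder_id"):
        gaps.append("funder_id")
    # A grant DOI is preferred; the exact award number is the fallback.
    if not record.get("grant_doi") and not record.get("award_number"):
        gaps.append("grant_doi or award_number")
    for person in record.get("investigators", []):
        if not person.get("orcid"):
            gaps.append("orcid for " + person["name"])
    for org in record.get("institutions", []):
        if not org.get("ror"):
            gaps.append("ror for " + org["name"])
    return gaps


# Hypothetical record: identifiers are placeholders, not real IDs.
RECORD = {
    "funder_id": "funder:0001",
    "grant_doi": None,
    "award_number": "AB/12345/1",
    "investigators": [
        {"name": "A. Researcher", "orcid": "0000-0000-0000-0000"},
        {"name": "B. Colleague", "orcid": None},
    ],
    "institutions": [{"name": "Example University", "ror": None}],
}
```

    For this record the check reports the missing ORCID iD and ROR ID but accepts the award number in place of a grant DOI, matching step 2 of the list.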

    Crediting the people the funding supported

    Funding metadata records what paid for the work; it does not record who did it. The CRediT taxonomy includes a dedicated Funding acquisition role — the work of securing the financial support that made the project possible — which lets the often-invisible labour of winning a grant be recorded on the resulting paper alongside the other contributions. Grant identifiers connect the award to the output; CRediT connects the people to the work the award funded. Together they ensure that both the money and the human contribution are visible in the record.

    Where shared vocabulary fits

    “Funder”, “grant”, “award”, “acknowledgement”, and “funding statement” are used inconsistently across publishers, funders, and institutions, which is part of why funding data is so hard to reconcile. A shared, federated vocabulary that defines these terms precisely — and points back to the Open Funder Registry and Crossref’s grant-linking schema — is what lets a funding acknowledgement written in one system be understood in another. Supplying that definitional layer is the role the CASRAI dictionary is designed to play; the relevant terms sit in the funding-and-finance domain.


  • The grant lifecycle as structured data: from call to closeout

    It is tempting to think of a grant as a single event: the moment an application is funded. In practice a grant is a lifecycle — a sequence of stages that begins long before the award and continues well after the last experiment, and each stage produces data that some system, somewhere, has to record. When that data is structured and carries persistent identifiers, the same fact can be entered once and read everywhere. When it is trapped in PDFs and free text, it is re-keyed at every transition, which is precisely the administrative burden CASRAI was founded to reduce. This article walks the grant lifecycle as a data object, drawing on the vocabulary of the funding and finance domain.

    Pre-award: the call and the proposal

    The lifecycle opens with a funder publishing an opportunity — a call for proposals (CFP) in UK and European usage, or a Funding Opportunity Announcement (FOA) in the US idiom. The call is itself structured data: it has a sponsoring funder (identifiable by a Crossref Funder Registry ID), a programme and a scheme, eligibility rules, a budget ceiling, and deadlines. A call expressed as machine-readable metadata can be matched automatically against researcher profiles in a CRIS, rather than circulated as an email attachment.

    The proposal that responds to it carries the costings, and this is where financial vocabulary needs to be precise. A budget separates direct costs (those attributable to the project) from indirect costs, or overhead: the costs of supporting research that cannot be tied to one project. Different regimes calculate the split differently. The UK uses full economic costing (fEC); US federal awards typically apply a negotiated indirect-cost rate to a modified total direct cost (MTDC) base. Recording which convention a budget follows is not pedantry: it is the difference between two figures that look comparable and are not.
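
    A small worked example shows why the base matters. The budget categories, the 50% rate, and the choice to exclude equipment from the base are illustrative assumptions; the real exclusions and rates are set by the funder’s rules and the institution’s negotiated agreement:

```python
from decimal import Decimal


def total_with_indirects(direct_costs, excluded_categories, rate):
    """Apply an indirect-cost rate to a base that omits excluded categories.

    direct_costs: dict mapping category name -> Decimal amount.
    excluded_categories: set of category names left out of the base.
    rate: Decimal fraction, e.g. Decimal("0.50") for 50%.
    """
    total_direct = sum(direct_costs.values(), Decimal("0"))
    base = sum(
        (amount for category, amount in direct_costs.items()
         if category not in excluded_categories),
        Decimal("0"),
    )
    indirect = (base * rate).quantize(Decimal("0.01"))
    return {
        "total_direct": total_direct,
        "base": base,
        "indirect": indirect,
        "total": total_direct + indirect,
    }


# Hypothetical budget, in any single currency.
BUDGET = {
    "salaries": Decimal("100000"),
    "consumables": Decimal("20000"),
    "equipment": Decimal("30000"),
}

modified_base = total_with_indirects(BUDGET, {"equipment"}, Decimal("0.50"))
full_base = total_with_indirects(BUDGET, set(), Decimal("0.50"))
```

    The same budget and the same nominal rate give indirect costs of 60,000 on the modified base and 75,000 on the full direct-cost base. Without a recorded convention, the two totals look comparable and are not.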

    Award: the notice and what it sets in train

    When a proposal succeeds, the funder issues an award notice: the formal communication that establishes the grant. This is the moment a great deal of metadata should crystallise. The award has a start and end date, a value, a set of reporting obligations, and — increasingly — a Crossref grant ID, a persistent identifier minted for the individual award through the Crossref grants schema. That grant ID is what lets a later publication, dataset, or piece of software cite the funding that produced it, closing the loop between money and output.

    The award is also the point at which a project record proper should come into being. Where the funder or institution uses a Research Activity Identifier (RAiD), the awarded grant is one of the funding entries that the project’s RAiD record references. The grant funds the project; the project is identified by its RAiD; the outputs cite the grant ID; the people carry ORCID iDs; and the institutions carry ROR IDs. Established at award, those connections travel through the rest of the lifecycle automatically.

    Post-award: the part everyone underestimates

    Most of a grant’s life is post-award, and most of its administrative friction lives here too. Funds are spent against budget lines, and reality rarely matches the plan exactly. A carry-forward moves unspent funds from one budget period into the next; an underspend is the budget left unspent at a period’s end; an overspend is expenditure beyond the awarded budget. Each is a structured fact a funder report needs, and each is far easier to assemble from a system that tracked it as data than from a spreadsheet reconstructed at deadline.
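
    These three terms reduce to simple arithmetic once budget and expenditure are held as data. The sketch below assumes one budget figure and one expenditure figure per period, and treats carry-forward approval as a plain yes or no; real funder rules are usually more detailed:

```python
from decimal import Decimal


def period_position(budget, spent):
    """Classify a budget period as underspend, overspend, or on budget."""
    difference = budget - spent
    if difference > 0:
        return ("underspend", difference)
    if difference < 0:
        return ("overspend", -difference)
    return ("on_budget", Decimal("0"))


def apply_carry_forward(next_period_budget, underspend, approved):
    """Add an approved underspend to the next period's budget."""
    if approved:
        return next_period_budget + underspend
    return next_period_budget


# Hypothetical figures for two consecutive budget periods.
status, amount = period_position(Decimal("50000"), Decimal("42000"))
year_two_budget = apply_carry_forward(Decimal("50000"), amount, approved=True)
```

    In this example the first period ends with an underspend of 8,000, and the approved carry-forward raises the second period’s budget to 58,000.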

    The single most common post-award event is the no-cost extension (NCE): an extension of the grant period without additional funds, granted when the work needs longer than planned. An NCE changes the project’s end date — which changes when reports are due, when closeout begins, and when the project record should flip status. If the end date lives only in an approval email, every downstream system drifts out of sync. If it lives in the project’s structured metadata, the NCE updates one field and the dependent dates recompute.
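
    A minimal sketch of "one field, recomputed dates": the end date is stored once and everything that depends on it is derived. The 90-day final-report window, the status names, and the `Project` class are hypothetical choices for illustration, not any funder’s actual policy:

```python
from dataclasses import dataclass
from datetime import date, timedelta

FINAL_REPORT_DAYS = 90  # assumed reporting window, not a real policy value


@dataclass
class Project:
    start: date
    end: date

    @property
    def closeout_begins(self):
        """Closeout starts the day after the project ends."""
        return self.end + timedelta(days=1)

    @property
    def final_report_due(self):
        """Final report deadline, derived from the end date."""
        return self.end + timedelta(days=FINAL_REPORT_DAYS)

    def status(self, today):
        """Project status on a given day, derived from start and end."""
        if today < self.start:
            return "pending"
        if today <= self.end:
            return "active"
        return "closeout"

    def grant_nce(self, new_end):
        """Record a no-cost extension by moving the end date later."""
        if new_end <= self.end:
            raise ValueError("an extension must move the end date later")
        self.end = new_end


project = Project(start=date(2023, 1, 1), end=date(2024, 12, 31))
report_due_before = project.final_report_due
project.grant_nce(date(2025, 6, 30))
report_due_after = project.final_report_due
```

    Only `end` was edited. The report deadline, the closeout start, and the status on any given day all follow from it, so no downstream date needs a separate correction.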

    Throughout the post-award phase, funders impose reporting requirements: interim and final reports, financial statements, output listings. A report assembled from structured grant data — outputs already linked by DOI to the grant ID, expenditure already categorised — is a query, not a transcription exercise.

    Closeout: the stage that is mostly data hygiene

    The lifecycle ends with closeout: the final administrative phase after the project’s end date. Closeout typically requires a final financial report reconciling expenditure against the award, a final outputs report, and confirmation that data-management and open-access obligations have been met. It is the stage where structured data pays off most, because closeout is almost entirely a matter of pulling together facts that should already exist as records.

    A well-run closeout is where the connections established at award prove their worth. If every output carries the grant ID, the final outputs report writes itself. If the data-management plan was machine-actionable and updated through the project, confirming that the realised datasets were deposited is a check, not an investigation. If the project’s RAiD aggregates its outputs, people, and funding, the closeout report is a view over an existing graph.
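
    The "check, not an investigation" idea can be sketched as a pair of queries. The record layouts, the placeholder DOIs, and the rule that a planned dataset counts as deposited once it has a DOI are all assumptions made for this example:

```python
def closeout_summary(grant_id, outputs, dmp_datasets):
    """Build a closeout view from records that already exist."""
    linked = [o["doi"] for o in outputs if grant_id in o["grant_ids"]]
    undeposited = [d["title"] for d in dmp_datasets if not d.get("doi")]
    return {
        "outputs": linked,
        "undeposited_datasets": undeposited,
        "ready": not undeposited,
    }


# Hypothetical project records; identifiers are placeholders.
OUTPUT_RECORDS = [
    {"doi": "10.5555/article-1", "grant_ids": ["10.5555/grant-A"]},
    {"doi": "10.5555/ds-1", "grant_ids": ["10.5555/grant-A"]},
    {"doi": "10.5555/article-9", "grant_ids": ["10.5555/grant-Z"]},
]
DMP_DATASETS = [
    {"title": "Survey data", "doi": "10.5555/ds-1"},
    {"title": "Interview transcripts", "doi": None},
]

summary = closeout_summary("10.5555/grant-A", OUTPUT_RECORDS, DMP_DATASETS)
```

    The summary lists the two outputs that cite the grant and flags the one planned dataset with no deposit recorded, so the remaining closeout work is visible at a glance.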

    Why structure the lifecycle at all

    The argument for treating the grant lifecycle as structured data is the same argument that runs through all of CASRAI’s work. The same facts — funder, award value, dates, outputs, expenditure categories — are entered repeatedly into incompatible systems because no shared, identifier-anchored representation exists. A controlled vocabulary for the lifecycle’s stages and financial concepts, federated to the funder taxonomies that already exist, is the precondition for entering each fact once. The pieces are increasingly in place: the Crossref Funder Registry for funders, Crossref grant IDs for awards, RAiD for projects, ORCID and ROR for people and institutions. What is missing is the shared definitional layer that ties them to a common lifecycle model — which is exactly the role the CASRAI dictionary is built to play.

    What to do now

    For research offices and CRIS owners: capture the lifecycle as structured stages (call, proposal, award, post-award events, closeout) rather than as a folder of documents, with the award’s Crossref grant ID as the spine. For funders: mint grant IDs at award and require them on outputs, so that closeout reporting becomes a query. For standards work: prioritise a shared vocabulary for the financial concepts that differ by jurisdiction (fEC, MTDC, indirect-cost base) so that comparable-looking figures are genuinely comparable.
