Aligning research outputs with the UN Sustainable Development Goals: metadata that travels

Open a recent institutional repository, a funder’s portfolio dashboard, or a publisher’s article page, and you will increasingly find research tagged against the United Nations Sustainable Development Goals — the 17 goals, adopted by all UN member states in 2015, that frame global priorities from ending poverty to climate action. SDG tagging has become one of the most common ways to express that a piece of research is relevant to a global challenge. It is also one of the least standardised, and understanding both halves of that statement is essential to using it well. This article examines how SDG alignment metadata works, drawing on the engagement, impact and SDG domain.

What SDG alignment metadata is

At its simplest, SDG alignment is a tag: an assertion that a research output contributes to, or is relevant to, one or more of the 17 goals. SDG 3 is Good Health and Well-being; SDG 7 is Affordable and Clean Energy; SDG 13 is Climate Action; and so on through the full set, each with its own targets and indicators beneath it. A paper on malaria vaccines might be tagged SDG 3; a paper on solar-cell efficiency, SDG 7; a paper that crosses domains, several at once.

The value of expressing this as metadata, rather than as prose buried in an abstract, is that it travels. A structured SDG tag attached to an output flows from the publisher into the aggregators, into institutional current research information systems (CRIS), into funder dashboards, and into the analytics products that roll research up by goal. An institution can answer “how much of our output contributes to climate action?” with a simple query only if the alignment is structured; the same question asked of free text requires a dedicated analysis project to answer.
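To make the point concrete, here is a minimal sketch of the kind of query that structured tags make trivial. The record layout, the field name sdgs, and the sample outputs are all hypothetical, invented for illustration rather than taken from any particular repository schema.

```python
from collections import Counter

# Hypothetical records: each output carries a list of SDG numbers (1-17).
outputs = [
    {"id": "out-1", "title": "Malaria vaccine trial", "sdgs": [3]},
    {"id": "out-2", "title": "Solar-cell efficiency", "sdgs": [7, 13]},
    {"id": "out-3", "title": "Coastal flood adaptation", "sdgs": [13]},
]


def count_by_goal(records):
    """Count how many outputs are tagged with each goal."""
    counts = Counter()
    for record in records:
        # set() guards against the same goal being listed twice on one output.
        for goal in set(record["sdgs"]):
            counts[goal] += 1
    return counts


counts = count_by_goal(outputs)

# Share of outputs tagged SDG 13 (Climate Action).
climate_share = counts[13] / len(outputs)
print(f"SDG 13: {counts[13]} of {len(outputs)} outputs ({climate_share:.0%})")
```

The count answers "how many outputs carry this tag", which is not the same as "how much of our research contributes to the goal". That distinction matters throughout the rest of this article.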

Where the tags come from, and why it matters

The central problem with SDG alignment is that there is no single authoritative method for assigning the tags, and the different methods disagree. There are three broad sources.

  • Author-asserted. The researcher selects the SDGs their work addresses, usually at submission. This captures intent and disciplinary nuance but is inconsistent across authors and prone to optimistic over-tagging.
  • Algorithmically classified. Several large bibliometric platforms assign SDG tags by running classifiers (typically keyword queries or machine-learning models) over titles, abstracts, and keywords. These scale to millions of outputs but are only as good as their query definitions or training data, and different vendors’ classifiers produce materially different mappings for the same paper. A well-known consequence is that an output’s SDG profile can change simply by switching data provider.
  • Curated taxonomies. Some publishers and repositories use editorially maintained mappings, which are more consistent but more expensive to maintain.
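One way to see how far sources disagree for a given output is to compare their tag sets directly. The sketch below uses Jaccard similarity (shared tags divided by all tags used by either source). This measure is chosen here for illustration and is not prescribed by any SDG tagging standard. The source names and tag sets are invented.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two tag sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty tag sets agree trivially
    return len(a & b) / len(a | b)


def pairwise_agreement(tags_by_source):
    """Return the Jaccard similarity for every pair of sources."""
    return {
        (first, second): jaccard(tags_by_source[first], tags_by_source[second])
        for first, second in combinations(sorted(tags_by_source), 2)
    }


# Hypothetical SDG tags for one paper, as assigned by three sources.
tags_by_source = {
    "author": {7, 13},
    "classifier_a": {7},
    "classifier_b": {7, 9, 13},
}

agreement = pairwise_agreement(tags_by_source)
for pair, score in agreement.items():
    print(pair, round(score, 2))
```

A score of 1.0 means two sources assigned identical tags; lower scores flag outputs whose SDG profile depends on who did the tagging.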

The practical implication is that SDG alignment metadata is genuinely useful for discovery and broad portfolio framing and genuinely unreliable for precise measurement or comparison across sources. Treating a vendor’s SDG counts as a precise metric — let alone as a basis for ranking — mistakes a fuzzy, method-dependent tag for a measurement. The honest use is directional: SDG tags tell you roughly where a body of work points, not exactly how much of it counts.

The connection to impact assessment

SDG alignment sits within the broader machinery of research impact, and it is most powerful when connected to it rather than treated as a standalone label. The UK’s Research Excellence Framework (REF) assesses impact through narrative impact case studies — structured accounts of how specific research led to specific non-academic benefit. An SDG tag and an impact case study are doing complementary work: the tag is a coarse, machine-readable signal of thematic relevance; the case study is a rich, human-readable account of actual effect. The tag helps you find the relevant work; the case study tells you what difference it made.

The same complementarity runs through the other impact vocabulary in this domain: a pathway to impact describes the intended route from output to benefit; policy uptake, evidenced through records such as those in policy-citation databases, captures realised influence; patient and public involvement documents who outside academia shaped the work. SDG alignment is the layer that lets all of this be aggregated thematically, but only if the alignment metadata is structured and its provenance is recorded.

Provenance is the missing field

If there is one recommendation that follows from the method-dependence problem, it is this: SDG alignment metadata should always carry its provenance. A tag that records how it was assigned — author-asserted, classifier (and which one, which version), or curated — is interpretable. A bare tag is not, because the consumer cannot know whether they are looking at an author’s considered judgement or a keyword match. As CASRAI’s broader dictionary work argues, metadata without provenance is metadata you cannot safely act on, and SDG tags are a textbook case. A repository that records “SDG 13, author-asserted” alongside “SDG 7, classifier vX” gives its consumers something they can reason about; one that records only “SDG 13, SDG 7” does not.
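As a rough sketch of what a provenance-carrying tag could look like in practice, the record below stores the assignment method alongside the goal and refuses a classifier tag that does not name its classifier and version. The class name, field names, and method labels are assumptions made for this example; they are not drawn from CASRAI or from any existing schema.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical controlled list of assignment methods.
METHODS = {"author-asserted", "classifier", "curated"}


@dataclass(frozen=True)
class SdgTag:
    goal: int                                  # SDG number, 1 to 17
    method: str                                # how the tag was assigned
    classifier: Optional[str] = None           # which classifier, if any
    classifier_version: Optional[str] = None   # which version, if any

    def __post_init__(self):
        if not 1 <= self.goal <= 17:
            raise ValueError(f"goal must be 1-17, got {self.goal}")
        if self.method not in METHODS:
            raise ValueError(f"unknown method: {self.method!r}")
        # A classifier tag is only interpretable with its name and version.
        if self.method == "classifier" and not (
            self.classifier and self.classifier_version
        ):
            raise ValueError("classifier tags need a classifier and version")

    def label(self):
        """Human-readable form of the tag with its provenance."""
        if self.method == "classifier":
            return (
                f"SDG {self.goal}, classifier "
                f"{self.classifier} {self.classifier_version}"
            )
        return f"SDG {self.goal}, {self.method}"


tags = [
    SdgTag(goal=13, method="author-asserted"),
    SdgTag(goal=7, method="classifier",
           classifier="example-classifier", classifier_version="vX"),
]
for tag in tags:
    print(tag.label())
```

Making provenance a required part of the record, rather than an optional note, means a bare tag cannot enter the system in the first place.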

Why a shared vocabulary helps

The SDGs themselves are a fixed, authoritative vocabulary — the UN defines the 17 goals and their targets, and that is not in dispute. What is not standardised is the metadata layer around the tag: how alignment is expressed in a record, how provenance is captured, how multiple tags are weighted or ordered, how the tag relates to impact and engagement records. That surrounding vocabulary is exactly the kind of integrative, machine-readable definition work the CASRAI dictionary exists to do — not to redefine the SDGs, but to standardise how alignment to them travels through research-information systems.

What to do now

For researchers and repositories: tag SDG alignment, but always record the provenance of each tag. For institutions and funders: use SDG metadata for thematic discovery and broad portfolio framing, and resist the temptation to treat vendor SDG counts as precise, comparable metrics. For impact assessment: connect SDG tags to the richer impact record — pathways, policy uptake, case studies — rather than letting the tag stand alone. The goal is metadata that travels and stays interpretable when it arrives.
