Research information lives in many systems. Institutional repositories hold open-access copies of papers, theses and datasets; current research information systems (CRIS) track projects, grants, outputs and people; disciplinary repositories, funder databases and discovery services each hold their own slice of the picture. None of these delivers its full value in isolation. The value of a repository depends on its contents being found, harvested and reused elsewhere; the value of a CRIS depends on its being connected to the systems that hold the underlying records. What turns isolated systems into a connected research-information ecosystem is a set of interoperability protocols and standards: the agreed ways in which systems exchange metadata, deposits and notifications. This article surveys the most important of them, drawing on the research information systems domain of the CASRAI Dictionary.
OAI-PMH: the foundation of metadata harvesting
The oldest and most widely deployed of these is the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Its purpose is to let one system — a harvester or aggregator — collect metadata records in bulk from another, such as a repository. A repository exposes its metadata through an OAI-PMH interface, and aggregators and other repositories can periodically harvest those records, building combined indexes that span many sources. OAI-PMH is the protocol that historically made cross-repository discovery possible: it is how aggregators assembled large indexes of open-access content from thousands of repositories, and it remains a backbone of the metadata-sharing landscape. It is a harvesting protocol — it moves descriptive metadata, not the full content — and its long dominance reflects how much value flows from gathering descriptions of outputs into one place.
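To make the harvesting model concrete, here is a minimal sketch in Python of the two halves of a harvest: building a ListRecords request and reading the records that come back. The verb, the request arguments and the XML namespaces are those defined by OAI-PMH 2.0; the endpoint repo.example.org, the sample response and the function names are illustrative assumptions, and nothing is actually fetched over the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and by Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical response, trimmed to one record for illustration.
SAMPLE_RESPONSE = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repo.example.org:123</identifier>
        <datestamp>2024-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
"""


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None,
                     resumption_token=None):
    """Build a ListRecords request URL (helper name is illustrative)."""
    if resumption_token:
        # A resumption token is exclusive: it replaces the other arguments.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # selective harvesting by datestamp
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "datestamp": rec.findtext("oai:header/oai:datestamp", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
        })
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


records, token = parse_list_records(SAMPLE_RESPONSE)
next_url = list_records_url("https://repo.example.org/oai", resumption_token=token)
```

A real harvester would fetch each URL, keep following resumption tokens until none is returned, and remember the date of its last run so that the next harvest can pass it as the from argument.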
SWORD: depositing across systems
Where OAI-PMH harvests metadata out of a system, SWORD (Simple Web-service Offering Repository Deposit) handles the reverse direction: depositing content into a repository from elsewhere. SWORD provides a standard way for one system to deposit an item — a paper, a dataset, the accompanying metadata — into a repository programmatically, without a person manually uploading it through a web form. This matters for workflow integration. It allows, for example, a publishing platform or a CRIS to deposit an output directly into an institutional repository, or a researcher’s tools to push a deposit automatically. By standardising deposit, SWORD reduces the friction and re-keying that would otherwise accompany getting content into repositories, helping outputs flow to where they need to be with less manual effort.
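As a rough illustration of programmatic deposit, the sketch below prepares (but does not send) an HTTP request of the kind a SWORD client makes. It is modelled on the SWORD v2 style of deposit, in which a packaged file is POSTed to a collection URL with a few descriptive headers; the collection URL, the file name and the function name are hypothetical, and the exact headers a given repository expects should be checked against its own SWORD documentation.

```python
import urllib.request


def build_deposit_request(collection_url, package_bytes, filename,
                          in_progress=False):
    """Prepare a SWORD v2-style binary deposit (assumed header set)."""
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"attachment; filename={filename}",
        # Packaging identifier for a plain zip of files, as used in SWORD v2.
        "Packaging": "http://purl.org/net/sword/package/SimpleZip",
        # "true" tells the repository that more content will follow.
        "In-Progress": "true" if in_progress else "false",
    }
    return urllib.request.Request(
        collection_url, data=package_bytes, headers=headers, method="POST"
    )


# Hypothetical collection URL and placeholder payload, for illustration only.
req = build_deposit_request(
    "https://repo.example.org/sword/collection/theses",
    b"placeholder zip bytes",
    "thesis.zip",
)
```

Sending the request would mean passing req to urllib.request.urlopen together with whatever authentication the repository requires; a successful deposit returns a receipt describing the newly created item.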
ResourceSync: keeping systems in step
OAI-PMH dates from the early 2000s and is oriented towards periodic batch harvesting of metadata. ResourceSync is a more modern standard addressing a related but broader need: synchronising resources, and changes to them, between systems efficiently and in a timely way. Built on the Sitemaps protocol that web crawlers already use, ResourceSync lets a destination system learn what a source holds, fetch it and, crucially, stay up to date as the source changes, rather than re-harvesting everything repeatedly. It can handle the resources themselves, not only their metadata, complementing the established harvesting model with one better suited to keeping large, changing collections in step.
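The idea of staying in step can be sketched as follows: a destination reads a ResourceSync change list, which is a Sitemap document with extra rs:md annotations, and applies each change to its local picture of the source. The two namespaces are the ones ResourceSync uses; the sample document, the URLs and the function names are illustrative assumptions, and the mirror here only tracks timestamps, where a real one would also fetch the changed resources.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

# Hypothetical change list published by a source repository.
SAMPLE_CHANGE_LIST = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="changelist" from="2024-03-01T00:00:00Z"/>
  <url>
    <loc>https://repo.example.org/item/1.pdf</loc>
    <lastmod>2024-03-02T10:00:00Z</lastmod>
    <rs:md change="updated"/>
  </url>
  <url>
    <loc>https://repo.example.org/item/2.pdf</loc>
    <lastmod>2024-03-03T09:30:00Z</lastmod>
    <rs:md change="deleted"/>
  </url>
  <url>
    <loc>https://repo.example.org/item/3.pdf</loc>
    <lastmod>2024-03-04T12:00:00Z</lastmod>
    <rs:md change="created"/>
  </url>
</urlset>
"""


def parse_change_list(xml_text):
    """Return a list of (location, change, lastmod) tuples."""
    root = ET.fromstring(xml_text)
    changes = []
    for url in root.iterfind("sm:url", NS):
        md = url.find("rs:md", NS)
        changes.append((
            url.findtext("sm:loc", namespaces=NS),
            md.get("change") if md is not None else None,
            url.findtext("sm:lastmod", namespaces=NS),
        ))
    return changes


def apply_changes(mirror, changes):
    """Apply changes to a {location: lastmod} mirror and return a new dict."""
    updated = dict(mirror)
    for loc, change, lastmod in changes:
        if change == "deleted":
            updated.pop(loc, None)
        elif change in ("created", "updated"):
            updated[loc] = lastmod  # a real client would re-fetch loc here
    return updated


mirror = {
    "https://repo.example.org/item/1.pdf": "2024-02-01T00:00:00Z",
    "https://repo.example.org/item/2.pdf": "2024-02-01T00:00:00Z",
}
mirror = apply_changes(mirror, parse_change_list(SAMPLE_CHANGE_LIST))
```

Only the three changed resources are touched, which is the point of the approach: the cost of each synchronisation scales with what changed, not with the size of the collection.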
COAR Notify: linking the distributed network
A more recent and conceptually important development is COAR Notify, an initiative of the Confederation of Open Access Repositories. It addresses a need the older protocols did not: allowing systems to notify one another about events and relationships in a standard, decentralised way. The motivating use cases include linking repository resources to overlay services such as peer review and endorsement: for instance, a repository being notified that one of its preprints has been reviewed by an external service, and recording that link. COAR Notify builds on existing web standards, W3C Linked Data Notifications for delivery and Activity Streams 2.0 for the message vocabulary, to let distributed systems pass these messages, weaving repositories into a connected network rather than leaving them as isolated stores. It reflects a shift in thinking: from repositories as passive archives to repositories as active participants in a distributed scholarly-communication system, exchanging not just records but signals about what is happening to those records.
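The preprint review case can be sketched as a message. The example below builds a JSON payload in the general shape of an Activity Streams 2.0 activity, the vocabulary on which COAR Notify messages are based, announcing that a review service has reviewed a repository preprint. The service and repository URLs, the function name and the precise choice of fields and type labels are assumptions for illustration; the authoritative payload patterns are the ones published in the COAR Notify specification.

```python
import json
import uuid


def build_review_notification(preprint_url, review_url, review_service,
                              repository, notification_id=None):
    """Build an 'announce review' style payload (field layout is assumed)."""
    return {
        "@context": [
            "https://www.w3.org/ns/activitystreams",
            "https://coar-notify.net",  # assumed context URL; check the spec
        ],
        "id": notification_id or f"urn:uuid:{uuid.uuid4()}",
        "type": ["Announce", "coar-notify:ReviewAction"],
        # Who is sending the message and where replies can be delivered.
        "origin": {"id": review_service["id"], "inbox": review_service["inbox"],
                   "type": "Service"},
        # Who should receive it: the repository that holds the preprint.
        "target": {"id": repository["id"], "inbox": repository["inbox"],
                   "type": "Service"},
        # The thing being announced (the review) and what it is about.
        "object": {"id": review_url, "type": ["Page", "sorg:Review"]},
        "context": {"id": preprint_url},
    }


# Hypothetical services, for illustration only.
review_service = {"id": "https://reviews.example.org",
                  "inbox": "https://reviews.example.org/inbox/"}
repository = {"id": "https://repo.example.org",
              "inbox": "https://repo.example.org/inbox/"}

payload = build_review_notification(
    "https://repo.example.org/preprint/42",
    "https://reviews.example.org/review/7",
    review_service,
    repository,
    notification_id="urn:uuid:00000000-0000-0000-0000-000000000001",
)
body = json.dumps(payload, indent=2)
```

Delivery would then be an HTTP POST of body to the repository's inbox URL. On receipt, the repository can record the link between the preprint and the review and display it alongside the item.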
CERIF: a shared model for research information
Protocols move information between systems, but the systems also need a shared model of what that information means. This is the role of CERIF (the Common European Research Information Format), a standard data model for research information — describing the entities of the research world (projects, people, organisations, outputs, funding) and, importantly, the relationships between them. CERIF gives CRIS platforms a common structure, so that research information can be exchanged between them without each system inventing its own incompatible representation. Where OAI-PMH, SWORD, ResourceSync and COAR Notify are about moving information, CERIF is about agreeing what the information is — the conceptual interoperability that makes the technical exchange meaningful.
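The distinction between entities and the relationships that connect them can be illustrated with a deliberately simplified model. The sketch below is not the CERIF schema: the class names, fields and role labels are hypothetical, chosen only to show the pattern in which a relationship is a record in its own right, carrying a role and a date range, so that the same person can relate to different projects and outputs in different ways.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Entity:
    kind: str  # e.g. "person", "project", "output", "organisation"
    id: str
    name: str


@dataclass(frozen=True)
class Link:
    """A typed, time-bounded relationship between two entities."""
    source: Entity
    target: Entity
    role: str  # hypothetical role label, ideally from a shared vocabulary
    start: Optional[date] = None
    end: Optional[date] = None


def related(links, entity, role=None):
    """Entities linked from `entity`, optionally filtered by role."""
    return [
        link.target for link in links
        if link.source == entity and (role is None or link.role == role)
    ]


# Illustrative data only.
alice = Entity("person", "p1", "Alice Example")
project = Entity("project", "pr1", "Coastal Erosion Study")
paper = Entity("output", "o1", "Erosion rates, 2010 to 2020")

links = [
    Link(alice, project, "principal investigator", date(2021, 1, 1), date(2023, 12, 31)),
    Link(alice, paper, "author"),
    Link(project, paper, "produced"),
]

authored = related(links, alice, role="author")
```

Because the relationship holds the role and dates, two systems that share this structure can exchange not only the list of people and projects but also who did what, and when.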
Why protocols are not enough on their own
These protocols and standards solve the mechanics of exchange, but moving metadata between systems achieves little if the systems disagree about what the metadata means. A protocol can faithfully transfer a record describing an output or a funding link, yet if the receiving system interprets those fields differently from the sender, the connection is hollow. Genuine interoperability requires agreement on the content of the metadata as well as the means of transport — a deeper challenge we explore further in our work on comparing standards.
A consistent vocabulary beneath the protocols
This is why a shared vocabulary sits beneath all of these protocols. For metadata harvested, deposited, synchronised or notified between systems to be understood correctly, the elements it contains — output types, relationship types, contributor roles, funding information — must mean the same thing everywhere. That consistency is what the CASRAI Dictionary provides: a shared vocabulary so that the information flowing through OAI-PMH, SWORD, ResourceSync and COAR Notify is understood identically wherever it lands. And because the contributions behind every output moving through these systems are part of the record, they can be described in the same shared framework — the CRediT taxonomy and its full set of contribution roles. The protocols are the plumbing; the shared vocabulary is what makes the water that flows through them drinkable everywhere it arrives.
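In practice, a shared vocabulary often shows up as a normalisation step at the point where a record arrives. The following sketch maps a receiving system's local labels onto shared terms and reports anything it cannot place. The output-type terms and the local mapping are hypothetical stand-ins for a real controlled vocabulary, and the role names are only a subset of the CRediT contributor roles; the function and variable names are illustrative.

```python
# Hypothetical shared output types, standing in for a real controlled list.
SHARED_OUTPUT_TYPES = {"journal article", "thesis", "dataset"}

# Hypothetical mapping from one system's local labels to the shared terms.
LOCAL_TO_SHARED = {
    "article": "journal article",
    "phd": "thesis",
    "data set": "dataset",
}

# A subset of the CRediT contributor roles, for illustration.
CREDIT_ROLES_SUBSET = {
    "Conceptualization", "Data curation", "Formal analysis",
    "Investigation", "Methodology", "Software", "Supervision",
}


def normalise_record(record):
    """Return (normalised_record, problems) for an incoming record."""
    problems = []

    raw_type = record.get("type", "").strip().lower()
    shared_type = LOCAL_TO_SHARED.get(raw_type, raw_type)
    if shared_type not in SHARED_OUTPUT_TYPES:
        problems.append(f"unknown output type: {record.get('type')!r}")
        shared_type = None

    # Match roles case-insensitively, but emit the canonical spelling.
    by_lower = {role.lower(): role for role in CREDIT_ROLES_SUBSET}
    roles = []
    for role in record.get("roles", []):
        match = by_lower.get(role.strip().lower())
        if match:
            roles.append(match)
        else:
            problems.append(f"unknown contributor role: {role!r}")

    return {**record, "type": shared_type, "roles": roles}, problems


incoming = {"title": "Tidal models", "type": "PhD",
            "roles": ["software", "Chief wrangler"]}
normalised, problems = normalise_record(incoming)
```

Reporting unmapped values, instead of silently dropping or guessing them, keeps the disagreement between systems visible, which is where agreement on a shared vocabulary has to start.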