A Current Research Information System (CRIS) that could not exchange data with anything else would be of limited use. Research information has to move: between an institution’s CRIS and its repository, between the CRIS and funders, between institutions, and into national and European aggregators. Making that movement work is the problem of interoperability, and the standard built expressly for it is CERIF. This article explains what CERIF is, how research-information interoperability works in practice, and where it still breaks.
What CERIF is
CERIF, the Common European Research Information Format, is a data model for representing research information in a way that different systems can exchange. It was developed with the support of the European Commission and is now maintained by euroCRIS, the international association for research-information professionals that also stewards the former CASRAI Catalogue of Elements. CERIF is not a product and not a database you install; it is a model, that is, a specification of entities, their attributes, and the relationships between them, which vendors and integrators can map their systems onto.
CERIF’s defining design choice is that it puts relationships at the centre. Rather than treating a publication as a record with an author field, CERIF models the person and the publication as separate entities and the linkage between them as a first-class, time-bounded relationship. The same applies to a person’s affiliation with an organisation, a funder’s support for a project, and a publication’s origin in a project. This relationship-centric design is what lets CERIF represent the messy realities of research (people who change institutions, projects with multiple funders, outputs that belong to several projects) without losing information.
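The relationship-centric idea can be sketched in a few lines of Python. This is an illustration only: the class names, field names, and role labels below are hypothetical and are not taken from the CERIF specification. What it shows is that the link carries its own role and date range, so a change of institution is a second link rather than an overwritten field.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Link:
    """A first-class, time-bounded relationship between two entities.

    Hypothetical shape for illustration; not the CERIF schema itself.
    """
    source_id: str               # e.g. a person
    target_id: str               # e.g. an organisation, project, or output
    role: str                    # what the relationship means
    start: date
    end: Optional[date] = None   # None means the link is still open

    def active_on(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day <= self.end)


def affiliations_on(links: List[Link], person_id: str, day: date) -> List[str]:
    """Return the organisations a person was affiliated with on a given day."""
    return [
        link.target_id
        for link in links
        if link.source_id == person_id
        and link.role == "employee"
        and link.active_on(day)
    ]


# A researcher who changed institutions: two links, nothing overwritten.
links = [
    Link("person-1", "org-a", "employee", date(2015, 1, 1), date(2019, 12, 31)),
    Link("person-1", "org-b", "employee", date(2020, 1, 1)),
    Link("person-1", "output-9", "author", date(2018, 3, 1)),
]
```

Asking `affiliations_on(links, "person-1", date(2018, 6, 1))` yields the earlier organisation, which is the affiliation that was true when `output-9` appeared. A flat record with a single affiliation field would have lost that.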
RIM is the function CERIF serves
It is worth restating the distinction from the previous article in this series. Research Information Management (RIM) is the practice; the CRIS is the system; CERIF is the interoperability standard that lets RIM systems exchange information. A research office practising RIM well wants its CRIS to speak a common language to the wider ecosystem, and CERIF is the most established candidate for that common language in the European context. The value proposition is straightforward: if two systems both map to CERIF, they can exchange research information without bespoke point-to-point integration.
How interoperability actually works in practice
CERIF is not the only standard in play, and real-world interoperability is a layered affair. Different exchange problems use different standards, and a mature institution speaks several.
- For exchanging publication metadata with the open-access ecosystem, the OpenAIRE Guidelines specify how a repository should expose its metadata so that OpenAIRE — the European open-science aggregator — can harvest it consistently. A repository that follows the guidelines becomes discoverable across the whole network.
- For exchanging the full text and structure of articles, JATS XML (the Journal Article Tag Suite) is the NISO standard that publishers and repositories use to represent article content in a machine-processable form.
- For making research information legible to the open web and search engines, Schema.org types such as Person and ScholarlyArticle express the same entities in the vocabulary that general-purpose web crawlers understand.
- For CRIS-to-CRIS exchange and feeds to national systems, CERIF provides the shared model.
Underneath all of these sit the persistent identifiers. CERIF, OpenAIRE, JATS, and Schema.org all become far more powerful when the entities they describe carry ORCID iDs, ROR IDs, and DOIs, because the identifiers let a record exported from one system be matched unambiguously to the corresponding record in another. Interoperability is not one standard; it is a stack, and identifiers are the layer that ties the stack together.
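As a rough sketch of how identifiers tie one layer of the stack together, the Python below builds a Schema.org description of an article as JSON-LD, using the identifier URLs as the `@id` of each entity. The function name is hypothetical, and the DOI and ROR values in the usage lines are placeholders rather than real records; the ORCID iD shown is the fictitious example iD that ORCID publishes for documentation.

```python
import json


def scholarly_article_jsonld(title, doi, author_name, orcid, org_name, ror_url):
    """Describe an article with Schema.org types, keyed by persistent identifiers.

    Illustrative helper, not part of any library.
    """
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "@id": f"https://doi.org/{doi}",             # DOI identifies the output
        "name": title,
        "author": {
            "@type": "Person",
            "@id": f"https://orcid.org/{orcid}",     # ORCID iD identifies the person
            "name": author_name,
            "affiliation": {
                "@type": "Organization",
                "@id": ror_url,                      # ROR ID identifies the organisation
                "name": org_name,
            },
        },
    }


doc = scholarly_article_jsonld(
    title="An Example Article",
    doi="10.1234/placeholder",                # placeholder DOI
    author_name="Josiah Carberry",
    orcid="0000-0002-1825-0097",              # ORCID's documentation example iD
    org_name="Example University",
    ror_url="https://ror.org/placeholder",    # placeholder ROR ID
)
serialised = json.dumps(doc, indent=2)
```

Because the person, organisation, and output each carry a resolvable identifier, a consumer that also holds a CERIF or OpenAIRE record for the same ORCID iD or DOI can join the two without comparing names or titles.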
Where interoperability still breaks
Despite a mature standard and a maturing identifier ecosystem, research-information exchange still fails in predictable places, and it is worth being honest about them.
The first failure point is local extension. CERIF is deliberately flexible, and that flexibility is double-edged: two systems can both “support CERIF” while populating its extension points so differently that exchange between them still requires custom mapping. A standard that everyone implements slightly differently is only partly a standard.
The second is vocabulary mismatch below the entity level. CERIF tells you how to model that a person has a relationship to an output; it does not, on its own, tell you what the controlled list of output types or project phases or ethics statuses should be. Two CERIF-compliant systems can disagree completely on whether a preprint is its own output type, and the exchange will technically succeed while silently losing meaning. This is the gap that a shared definitional vocabulary fills.
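A small Python sketch makes the silent loss visible. The vocabulary values and the crosswalk below are invented for illustration and do not come from CERIF or any real system. The point is that a crosswalk between two output-type lists can be checked up front for many-to-one mappings and for unmapped values, instead of letting the exchange succeed quietly.

```python
from collections import defaultdict

# Hypothetical crosswalk from system A's output types to system B's.
# System B has no separate type for preprints.
CROSSWALK = {
    "journal-article": "article",
    "preprint": "article",
    "book-chapter": "chapter",
}


def lossy_targets(crosswalk):
    """Return target values that more than one source value collapses into."""
    sources_by_target = defaultdict(list)
    for source, target in crosswalk.items():
        sources_by_target[target].append(source)
    return {
        target: sorted(sources)
        for target, sources in sources_by_target.items()
        if len(sources) > 1
    }


def translate(value, crosswalk):
    """Translate one value, failing loudly instead of guessing a default."""
    if value not in crosswalk:
        raise KeyError(f"no mapping for output type {value!r}")
    return crosswalk[value]
```

Here `lossy_targets(CROSSWALK)` reports that `journal-article` and `preprint` both land on `article`: the transfer works, but the receiving system can no longer tell the two apart. A shared vocabulary removes the need for the crosswalk; short of that, a check like this at least turns silent loss into a known one.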
The third is identifier coverage. Interoperability degrades exactly where identifiers are missing: a researcher without an ORCID, an organisation without a ROR ID, an output without a DOI. The exchange falls back to string matching, and string matching fails on names.
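The degradation can be sketched as an identifier-first matcher with a name fallback. This is a minimal Python illustration under assumed record shapes (plain dictionaries with `name` and optional `orcid` keys); the function names are hypothetical, and real matching pipelines are considerably more elaborate.

```python
def normalise(name):
    """Lower-case and collapse whitespace; deliberately naive."""
    return " ".join(name.lower().split())


def match_person(incoming, local_records):
    """Match on ORCID iD when present, otherwise fall back to exact name match.

    Returns (record, method), where method is "orcid", "name", or "unmatched".
    """
    orcid = incoming.get("orcid")
    if orcid:
        for record in local_records:
            if record.get("orcid") == orcid:
                return record, "orcid"

    # Fallback: string matching, accepted only when exactly one record fits.
    wanted = normalise(incoming["name"])
    candidates = [r for r in local_records if normalise(r["name"]) == wanted]
    if len(candidates) == 1:
        return candidates[0], "name"
    return None, "unmatched"


local = [
    {"id": 1, "name": "Jane Smith", "orcid": "0000-0002-1825-0097"},  # example iD
    {"id": 2, "name": "John Smith"},
]

with_id = match_person({"name": "J. Smith", "orcid": "0000-0002-1825-0097"}, local)
without_id = match_person({"name": "J. Smith"}, local)
```

With the ORCID iD present, the abbreviated name is irrelevant and the record matches. Without it, "J. Smith" matches neither "Jane Smith" nor "John Smith", and loosening the comparison to initials would make it match both.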
Where the dictionary fits
CERIF is a structural standard: it defines the shape of the data. A shared dictionary is a definitional standard: it defines what the values mean. The two are complementary, not competing. The CASRAI dictionary’s own design recognises this, treating itself as definitional where the Catalogue of Elements and CERIF are structural. The places where CERIF interoperability breaks below the entity level (output-type lists, status vocabularies, lifecycle stages) are precisely where a federated, operational vocabulary adds the missing layer. Supplying that layer, while pointing back to euroCRIS as the steward of the structural model, is the integrative role the CASRAI dictionary is designed for.
What to do now
For institutions: treat interoperability as a stack — CERIF for CRIS exchange, OpenAIRE Guidelines for repository harvesting, JATS for article structure, Schema.org for the open web — and invest in the identifiers that hold it together. For standards work: pair the structural model with shared, operational vocabularies for the value lists that CERIF leaves open, federating to euroCRIS for the model itself.