Profile
Composite institution profile
| Institution type | Regional consortium of 17 universities across 5 EU member states |
| Size | ~210,000 researchers across the consortium; combined Horizon Europe portfolio of >€800m active grants |
| Country / region | 5 EU member states (composite — small, medium, and large national systems represented) |
| Research areas | Cross-disciplinary; consortium runs joint doctoral programmes and shared infrastructure |
| CRIS / repository | Mixed estate — five universities on Pure, four on DSpace-CRIS, three on Symplectic Elements, two on Converis, three on local-build systems |
The challenge
What problem were they trying to solve?
The consortium had operated for seven years on the principle of "infrastructure together, metadata separate" — every member university ran its own CRIS, reported separately to its national funder, and translated its data into the consortium's shared reporting templates by hand. With Horizon Europe scaling up, joint doctoral programmes had become a major activity, and the manual translation work had hit a breaking point. A consortium-funded analysis estimated that consortium-level reporting was consuming the equivalent of seven full-time research-administrator FTEs across the seventeen universities, almost entirely on tasks that were "the same data in different formats". The consortium board commissioned a metadata-harmonisation programme in 2024 with three deliverables: a shared vocabulary all seventeen universities would adopt, a cross-walk to each national funder's reporting taxonomy, and a multilingual implementation guide (English, French, German, Dutch, and Italian — covering the working languages of the consortium's national systems). The political constraint was strict: no university would be asked to replace its CRIS, and no national funder would be asked to change its reporting categories.
The approach
How they implemented it
The consortium chose the CASRAI Dictionary as the shared vocabulary because it already provided the cross-walk pattern the consortium needed: every Dictionary term has a defined relationship to adjacent standards (DataCite, MARC 21, ORCID, Schema.org), so adding national-funder cross-walks was a manageable extension rather than a from-scratch design. The harmonisation programme ran in two strands. Strand A (vocabulary) appointed a working group of seventeen research-information professionals — one per university — chaired jointly by the consortium's data-architecture lead and a senior librarian from the largest member. The group mapped each national funder's reporting taxonomy onto the CASRAI Dictionary domain structure (research outputs, research data, projects and funding, people and affiliations, research integrity), and explicitly flagged the mismatches. Strand B (translation) worked in parallel: a team of professional translators with research-administration domain knowledge produced the four non-English versions of the Dictionary terms most likely to appear in consortium reporting, using the canonical English term and definition as the source of truth and pinning the translated label as a localised alias rather than a separate concept. The two strands converged in month nine into a unified implementation guide and a JSON-LD-formatted cross-walk file that each university imported into its own CRIS as a controlled vocabulary.
Timeline
Rollout phases
- Months 1–3
Working group formation + scoping
Seventeen universities appointed representatives, governance terms agreed by consortium board. Scope locked to vocabulary harmonisation (not CRIS replacement, not funder-reporting reform).
- Months 4–6
Cross-walks to 5 national funder taxonomies
Strand A produced cross-walks from each national funder's reporting categories to the CASRAI Dictionary. ~80% of categories mapped cleanly; ~15% required composite mappings; ~5% were flagged as genuinely incompatible.
- Months 7–9
Four-language translation
Strand B produced French, German, Dutch, and Italian versions of ~480 high-frequency Dictionary terms. Used CASRAI's canonical English term as the source of truth; localised labels surfaced as aliases.
- Months 10–12
Unified guide + JSON-LD cross-walk file
Strands A and B merged into a 90-page implementation guide and a machine-readable JSON-LD cross-walk file. Both released under CC-BY 4.0 to the wider EU research-administration community.
- Months 13–18
Per-university adoption
Each of the seventeen universities imported the cross-walk into its own CRIS. Three universities completed adoption in three months; the slowest took fifteen.
Outcomes
Illustrative outcomes
Every metric below is illustrative — synthesised from observed patterns across multiple adoption journeys, not attributed to a single real institution.
~7 FTE
estimated reduction in consortium-level reporting overhead by end of year 2
5 → 1
reporting taxonomies a consortium PI now interacts with (in principle)
~480
Dictionary terms translated into French, German, Dutch, Italian
17/17
consortium universities adopted the cross-walk within 18 months
~15%
of national-funder categories required composite mappings (well-bounded)
CC-BY 4.0
implementation guide released to the wider community (and adopted by 4 non-member universities)
Lessons learned
What they would tell the next institution
- 01Federation, not displacement, was the load-bearing principle. The consortium would have failed if it had tried to force a single CRIS choice; succeeded because it harmonised vocabulary while letting each university keep its tools.
- 02The 15% of "incompatible" funder categories were the most valuable finding. Flagging them publicly gave each national funder a clear list of where their reporting taxonomy diverged from the consortium's, which started a separate conversation about funder-side reform.
- 03Translation needed domain experts, not generic translators. Two early translations from a non-specialist agency had to be redone because they translated "principal investigator" with the literal national-context term that no researcher would recognise.
- 04JSON-LD as the cross-walk file format was a strong choice. Every CRIS in the consortium could import it; if the team had picked CSV or Excel, the smaller universities without integration capacity would have stalled.
- 05Per-university adoption pace varied 5x. Plan for the slowest, not the average.
What's next
Planned next steps
The consortium is now extending the cross-walk to cover CRediT contributor roles specifically (the current cross-walk covers research-administration vocabulary but not contribution-level data), and is working with euroCRIS to feed lessons back into the CERIF standard. A second consortium in northern Europe has indicated interest in joining the working group, which would extend the cross-walk to two additional national funder taxonomies.
Q&A with the composite project lead
Composite project-lead Q&A
The questions and answers below are composite — synthesised from interview patterns across multiple real project leads. They are not attributed to a specific real person.
- Why CASRAI Dictionary rather than building from scratch?
- Two reasons. First, the Dictionary already had the cross-walk pattern we needed (mappings to DataCite, MARC 21, ORCID, Schema.org), so adding national-funder mappings was an extension rather than a foundation. Second, CC-BY 4.0 — every member university's legal team could approve adoption without a procurement cycle, which would have added a year.
- How did you handle the 5% of genuinely incompatible categories?
- Honestly. We published the list. For each incompatible category we documented which national funder used it, why it did not map cleanly to the Dictionary, and what consortium-level reporting would do as a workaround. That transparency turned out to be more valuable than forcing a mapping that would have been wrong.
- Why pin translations as aliases rather than separate concepts?
- To avoid concept drift. If we had made "directeur de recherche" a separate Dictionary concept from "principal investigator", we would eventually have had two definitions that diverged. Treating translations as localised labels of one underlying concept means there is exactly one definition that gets translated, not seventeen.
- What would you do differently?
- Start the translation work in month one, not month seven. We sequenced vocabulary first, translation second, on the assumption that translation would be cheap and fast. It was neither. If we had started translation in parallel from day one we would have shaved three months off the overall programme.
Cited CASRAI resources
Internal CASRAI resources referenced
- CASRAI Dictionary overview
- Dictionary: CRIS
- Dictionary: ROR
- Dictionary: Principal Investigator
- Federation: euroCRIS / CERIF
- Federation: cross-walks (DataCite, MARC 21, ORCID, Schema.org)
- CRediT translations
- Horizon Europe CRediT statement guidance
- ERC CRediT statement guidance
- CRIS integration notes
- Implement: Elsevier Pure
- Implementation checklist: projects and funding








