Tag: research information management

  • Research information management maturity: planning and implementing a CRIS

    Sooner or later, most research-performing organisations decide they need a better grip on their research information — on who is doing what research, what outputs result, what funding supports it, and how it all connects. The usual answer is a current research information system (CRIS): the central system that brings together information about an institution’s research activity. But the most common mistake in adopting one is to treat it as a procurement exercise — a matter of choosing and installing software. A CRIS is only as good as the data that flows into it, the governance that keeps that data trustworthy, and the people and processes that sustain it. Whether an implementation succeeds depends less on the product chosen than on the institution’s research-information-management (RIM) maturity. This article examines what that maturity involves, drawing on the research information systems domain of the CASRAI Dictionary and on our wider learning resources.

    What RIM maturity means

    Research information management maturity describes how developed and capable an institution is at managing information about its research, measured not by the software it owns but by its practices. A mature organisation has clarity about what research information it holds and why; it has agreed definitions and consistent data; it has governance that assigns responsibility for quality; and it has a culture in which keeping that information accurate is a normal, valued activity rather than a periodic scramble. A less mature organisation may have data scattered across spreadsheets and disconnected systems, defined differently in each, owned by no one in particular, and trusted by few. The concept is useful because it shifts the question from “which system should we buy?” to “how ready are we to use one well?” A CRIS dropped into an immature environment tends to automate existing confusion; one built on solid foundations can be transformative.

    Data governance as the foundation

    At the centre of RIM maturity lies data governance: the framework of policies, responsibilities and processes that determines how research information is defined, who is accountable for it, and how its quality is maintained. Governance answers the unglamorous but decisive questions on which a CRIS depends. What exactly do we mean by a “publication” or a “project”? Who ensures a researcher’s outputs are recorded correctly? How do we resolve conflicting records of the same thing? What is our authoritative source for each kind of information? Without answers, a CRIS becomes a tidy-looking container for untrustworthy data, and the reports it produces — for funders, for assessment, for management — cannot be relied upon. Strong data governance is what makes the information in a CRIS trustworthy, and trustworthiness is the entire point. Establishing governance is therefore not a step that follows implementation; it is the foundation on which a successful implementation is built.
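    One way to make the "authoritative source" question concrete is to write the agreed answer down as a lookup that the loading process must consult. The following is a minimal sketch only: the field names, the source names and the `resolve` helper are hypothetical illustrations, not part of any CRIS product or of the CASRAI Dictionary.

```python
# Hypothetical governance decision: which source system wins for each field.
AUTHORITATIVE_SOURCE = {
    "job_title": "hr",
    "department": "hr",
    "award_amount": "finance",
    "publication_title": "repository",
}


def resolve(field_name, candidates):
    """Pick the value supplied by the agreed authoritative source.

    candidates maps a source name to the value that source holds.
    Returns None when the authoritative source supplied nothing, so the
    gap stays visible instead of being filled from a less trusted source.
    """
    source = AUTHORITATIVE_SOURCE.get(field_name)
    if source is None:
        # No governance decision exists yet: refuse to guess.
        raise KeyError(f"no authoritative source agreed for {field_name!r}")
    return candidates.get(source)


department = resolve(
    "department",
    {"hr": "School of Engineering", "spreadsheet": "Eng."},
)
```

    The point of the sketch is the failure mode: a field with no agreed source raises an error rather than silently accepting whichever record arrived first.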

    Interoperability and the role of CERIF

    A CRIS does not, and should not, stand alone. It needs to exchange information with many other systems — institutional repositories, human-resources and finance systems, funder platforms, persistent-identifier registries and national infrastructures. This makes interoperability a central concern, and it is where shared standards become essential. The Common European Research Information Format (CERIF) is a standard data model for research information, developed to enable research-information systems to exchange data in a consistent, structured way. By describing research entities — people, projects, outputs, organisations, funding — and their relationships in a common model, CERIF allows information to move between systems without being garbled or lost. The standard is maintained and promoted by euroCRIS, the international organisation dedicated to research information and the CRIS community. Choosing a CRIS without attention to interoperability is choosing a future island; designing for exchange from the start, with standards like CERIF in mind, is what lets a CRIS take its place in a connected ecosystem rather than becoming another silo.

    Planning an implementation

    With maturity, governance and interoperability understood, a CRIS implementation can be approached as the organisational change it really is. Several considerations recur:

    • Start with purpose. Be clear about what the CRIS is for — reporting, assessment, profiles, discovery — because purpose drives every later decision about scope and data.
    • Audit existing information. Understand what data the institution already holds, where it lives, how good it is and who owns it, before bringing it together.
    • Establish governance early. Agree definitions, responsibilities and authoritative sources before loading data, not after.
    • Design for interoperability. Plan how the CRIS will connect to repositories, identifiers and external systems, using shared standards from the start.
    • Invest in people and process. A CRIS changes how researchers and administrators work; engagement, training and clear processes matter as much as the technology.

    Connecting to the wider ecosystem

    A well-implemented CRIS becomes a hub that links an institution’s research information to the world beyond it — feeding national assessment exercises, exchanging data with funders, connecting to repositories, and using persistent identifiers to disambiguate people and organisations. The richer this connectivity, the more value the CRIS delivers and the less manual re-entry it demands. This is why interoperability standards, persistent identifiers and the federation of research information — explored in our resources on research administration — are not optional extras but the very things that turn a CRIS from an internal database into part of a connected scholarly infrastructure.

    A consistent vocabulary beneath it all

    Every theme here (consistent data, governance, interoperability, exchange with other systems) depends on one requirement: that the elements of research information mean the same thing across systems and institutions. A “project”, an “output” or a “contributor role” must be defined compatibly, or no amount of technology will make the data line up. That consistency is what the CASRAI Dictionary provides: a shared vocabulary so that the information held in and exchanged by a CRIS is understood identically wherever it travels. And because so much of what a CRIS records is contribution, contributor roles can be described in a shared framework too: the CRediT taxonomy. A CRIS is not bought; it is grown, on foundations of mature practice, sound governance, shared standards and a common vocabulary.

  • What a CRIS does: the research-information backbone explained

    Most universities run a system that quietly underpins a great deal of their research administration, and most researchers could not name it. It is the Current Research Information System (CRIS): the institutional backbone that ties together who the researchers are, what projects they run, who funds them, and what they produce. This article gives a plain-language account of what a CRIS does, why it matters, and why it depends so heavily on shared vocabulary. It draws on the research information systems domain of the CASRAI Dictionary.

    CRIS and RIM: the system and the function

    Two terms travel together and are easily confused. A CRIS is the software system. Research Information Management (RIM) is the broader discipline and practice of managing research information — the function that the CRIS supports. RIM is what a research office does; the CRIS is the tool it uses to do it. Both terms appear because the same activity is described from two angles: the operational system and the professional practice. Familiar CRIS products include Pure, Symplectic Elements, Converis, Worktribe, and the open-source VIVO and DSpace-CRIS.

    What a CRIS actually holds

    A CRIS is, at heart, a set of connected records about a handful of entity types and the relationships between them. The core entities are people, organisational units, projects, funding, and outputs. The value is in the connections: this researcher, in this department, leads this project, funded by this award, which produced these publications and datasets. Each entity is a record; the CRIS is the graph that joins them.
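    The "records joined by a graph" idea can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the data model of any CRIS product or of CERIF: the class names, relation names and sample records are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    kind: str   # "person", "org_unit", "project", "funding" or "output"
    key: str    # hypothetical local identifier
    label: str


@dataclass
class ResearchGraph:
    entities: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # (source key, relation, target key)

    def add(self, entity):
        self.entities[entity.key] = entity
        return entity

    def link(self, source, relation, target):
        self.links.append((source.key, relation, target.key))

    def related(self, key, relation):
        """Entities reached from `key` by following one `relation` link."""
        return [self.entities[t] for s, r, t in self.links
                if s == key and r == relation]


graph = ResearchGraph()
person = graph.add(Entity("person", "p1", "A. Researcher"))
dept = graph.add(Entity("org_unit", "u1", "Department of Example Studies"))
project = graph.add(Entity("project", "j1", "Example project"))
award = graph.add(Entity("funding", "f1", "Example award"))
paper = graph.add(Entity("output", "o1", "Example article"))

graph.link(person, "member_of", dept)
graph.link(person, "leads", project)
graph.link(award, "funds", project)
graph.link(project, "produced", paper)


def outputs_of(graph, person_key):
    """Walk person -> projects led -> outputs produced."""
    return [out
            for proj in graph.related(person_key, "leads")
            for out in graph.related(proj.key, "produced")]


labels = [e.label for e in outputs_of(graph, "p1")]
```

    Here `labels` holds the single output reached through the project the person leads; the value comes from the links, since no record stores that answer directly.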

    The researcher profile is the entity most people encounter. It aggregates a person’s affiliations, outputs, projects, and activities into a single record — the thing that often surfaces as a public staff page. Behind it sits an organisational hierarchy: the structured representation of departments, schools, institutes, and centres, so that the system can roll outputs and funding up to any level of the institution. The quality of that hierarchy determines whether “how much did the School of Engineering publish last year?” is a one-click query or a week of manual work.
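    The roll-up that a good hierarchy makes possible can be illustrated as follows. This is a sketch with invented unit names and counts; a real CRIS would hold the hierarchy in its own data model, and the `roll_up` helper is hypothetical.

```python
from collections import defaultdict

# Hypothetical hierarchy: child unit -> parent unit.
PARENT = {
    "Dept of Civil Engineering": "School of Engineering",
    "Dept of Mechanical Engineering": "School of Engineering",
    "School of Engineering": "University",
    "School of History": "University",
}

# Invented counts of outputs recorded directly against each unit.
OUTPUTS_BY_UNIT = {
    "Dept of Civil Engineering": 12,
    "Dept of Mechanical Engineering": 7,
    "School of Engineering": 1,
    "School of History": 5,
}


def roll_up(parent, counts):
    """Add each unit's count to itself and to every ancestor above it."""
    totals = defaultdict(int)
    for unit, n in counts.items():
        node, seen = unit, set()
        while node is not None and node not in seen:
            seen.add(node)          # guards against a cyclic hierarchy
            totals[node] += n
            node = parent.get(node)
    return dict(totals)


totals = roll_up(PARENT, OUTPUTS_BY_UNIT)
```

    With a clean hierarchy, `totals["School of Engineering"]` is one dictionary lookup; with a missing or wrong parent link, the school's figure is quietly short, which is the week of manual work the paragraph above describes.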

    The core job: getting data in

    A CRIS is only as useful as the data in it, and the central operational challenge is keeping that data current without burying researchers in data entry. Two mechanisms do most of the work. A publication harvest automatically imports publication metadata from external sources — Crossref, Scopus, Web of Science, PubMed, ORCID — so that a researcher’s output list populates itself rather than being typed in. A funder ingest imports funding and award metadata, so that grants appear against the right people and projects.

    Neither mechanism is reliable without identifiers. A publication harvest that matches on author name alone will mis-assign work whenever two researchers share a name; matching on ORCID iD resolves the person unambiguously. A funder ingest that matches on institution name will fragment one university across a dozen spelling variants; matching on ROR ID collapses them to one. This is why the maturation of the persistent-identifier ecosystem has done more for CRIS data quality than any feature in the software itself.
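    The difference between name matching and identifier matching can be shown with a small sketch. The researcher records, the ORCID iD values (made-up placeholders, not real registrations) and the `assign_author` helper are all assumptions for illustration; no harvesting API is being described.

```python
# Hypothetical local researcher records. The ORCID iDs are placeholders.
RESEARCHERS = [
    {"id": "r1", "name": "J. Smith", "orcid": "0000-0000-0000-0001"},
    {"id": "r2", "name": "J. Smith", "orcid": "0000-0000-0000-0002"},
    {"id": "r3", "name": "A. Jones", "orcid": None},
]


def assign_author(author, researchers=RESEARCHERS):
    """Return the local researcher id for a harvested author, or None.

    None means "send to manual review": the record is not guessed at.
    """
    orcid = author.get("orcid")
    if orcid:
        for r in researchers:
            if r["orcid"] == orcid:
                return r["id"]
        return None  # iD not known locally: do not fall back to the name
    # No identifier supplied: accept a name match only if it is unique.
    same_name = [r for r in researchers if r["name"] == author["name"]]
    if len(same_name) == 1:
        return same_name[0]["id"]
    return None


with_id = assign_author({"name": "J. Smith", "orcid": "0000-0000-0000-0002"})
name_only = assign_author({"name": "J. Smith", "orcid": None})
unique_name = assign_author({"name": "A. Jones", "orcid": None})
```

    In this sketch the identified author resolves to `r2`, the unidentified "J. Smith" is left for review because two candidates share the name, and "A. Jones" resolves only because the name happens to be unique. The same pattern applies to organisations, with a ROR ID in place of the ORCID iD.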

    Disambiguation, enrichment, validation

    Three less-visible activities determine whether a CRIS is trusted. Disambiguation is the process of resolving ambiguous identifications — two authors with the same name, two spellings of one organisation — to canonical entities. Enriched metadata is metadata improved with information from external sources: adding Crossref Funder Registry IDs to funding records, adding ROR IDs to affiliations, adding DOIs to outputs that arrived without them. A validation rule is a check applied during ingest to enforce data quality — rejecting a publication record with no identifier, flagging an award whose dates fall outside its project. Together these turn a heap of imported records into a research-information asset an institution can report from with confidence.
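    The two validation rules mentioned above can be sketched as plain checks run at ingest. The record layouts, field names and message wording here are hypothetical; real systems express such rules in their own configuration.

```python
from datetime import date

# Hypothetical set of fields that count as a persistent identifier.
IDENTIFIER_FIELDS = ("doi", "pmid", "handle")


def validate_publication(record):
    """Reject a publication record that carries no identifier at all."""
    problems = []
    if not any(record.get(name) for name in IDENTIFIER_FIELDS):
        problems.append("reject: no persistent identifier")
    return problems


def validate_award(award, project):
    """Flag an award whose dates fall outside its project's dates."""
    problems = []
    if award["start"] < project["start"] or award["end"] > project["end"]:
        problems.append("flag: award dates fall outside project dates")
    return problems


project = {"start": date(2022, 1, 1), "end": date(2024, 12, 31)}
early_award = {"start": date(2021, 10, 1), "end": date(2024, 6, 30)}
good_award = {"start": date(2022, 3, 1), "end": date(2024, 6, 30)}

pub_problems = validate_publication({"title": "Untitled draft", "doi": ""})
award_problems = validate_award(early_award, project)
clean = validate_award(good_award, project)
```

    Returning a list of problems rather than raising on the first one lets an ingest report everything wrong with a record at once, and lets "reject" and "flag" outcomes be routed differently.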

    What the CRIS is for

    The reason institutions invest in a CRIS is that the same research-information facts are needed, repeatedly, for many different purposes. Annual reporting, research assessment exercises, open-access compliance monitoring, public staff and project pages, internal resource allocation, and responses to funder audits all draw on the same underlying entities. Without a CRIS, each of these is a separate data-gathering exercise; with one, they are views over a single maintained graph. The CRIS is the institution’s single source of truth for research information, and its value depends directly on how trustworthy that single source is.

    This is also why a CRIS connects outward. It is not an island: it harvests from Crossref and ORCID, it can push validated publications to a repository, it feeds open-access compliance dashboards, and increasingly it exchanges project information using shared models. A modern CRIS is a node in an institutional and sectoral information fabric, not a closed database.

    Why shared vocabulary is the precondition

    Here is the catch that connects the CRIS to CASRAI’s mission. Every CRIS implementation that invents its own field names (its own way of recording an ethics status, an output type, a project phase, a funding category) creates a system that cannot exchange data cleanly with any other. The harvests work because Crossref, ORCID, and ROR provide shared identifiers and shared metadata. The internal records often do not interoperate, because each institution structured them locally. A controlled, shared vocabulary for the entities and attributes a CRIS holds is what would let research information move between institutions as cleanly as it now moves in from the identifier providers. Supplying that definitional layer is the convening role the CASRAI Dictionary exists to play.

    What to do now

    For institutions running a CRIS: invest in identifiers first, because ORCID and ROR adoption do more for data quality than any software feature. Treat disambiguation, enrichment, and validation as ongoing operations, not one-off projects. For those procuring or integrating systems: use vendor-neutral, shared vocabulary to specify what you need, so the conversation is about your requirements rather than one product’s field names.
