Dictionary domain · Track B

Research data infrastructure

Trusted repositories, EOSC, biobanks, data trusts, federated infrastructure.


For implementers

Operational deployment checklist for Research data infrastructure: prerequisites, five deploy steps, integration notes for Pure, Symplectic Elements, Worktribe, DSpace, and more, plus the pitfalls that recur in the field.


Terms in this domain

43 terms

Dictionary term · Stable

Aggregator service

A service that harvests, harmonises, and re-exposes metadata and (sometimes) content from many upstream sources, providing a unified search, browse, or query interface across the aggregated corpus; canonical examples include OpenAIRE, BASE, CORE, and OpenAlex.
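The harvest, harmonise, and re-expose cycle can be sketched in a few lines. This is a minimal illustration only: the `Record` type, the field mappings in `harmonise`, and the "first source wins" de-duplication rule are assumptions made for the example, not the behaviour of any named aggregator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    doi: str
    title: str
    source: str


def harmonise(raw: dict, source: str) -> Record:
    """Map one upstream record onto a common shape (hypothetical field names)."""
    doi = (raw.get("doi") or raw.get("identifier") or "").lower()
    doi = doi.removeprefix("https://doi.org/")
    title = (raw.get("title") or raw.get("name") or "").strip()
    return Record(doi=doi, title=title, source=source)


def aggregate(feeds: dict[str, list[dict]]) -> dict[str, Record]:
    """Harvest every feed and keep one record per DOI (first source wins)."""
    index: dict[str, Record] = {}
    for source, records in feeds.items():
        for raw in records:
            record = harmonise(raw, source)
            if record.doi:
                index.setdefault(record.doi, record)
    return index


def search(index: dict[str, Record], term: str) -> list[Record]:
    """Re-expose the aggregated corpus through a single title search."""
    return [r for r in index.values() if term.lower() in r.title.lower()]
```

Real aggregators apply far richer matching and provenance tracking; the point here is only that harmonisation happens once, at ingest, so that every consumer queries one consistent index.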

Research data infrastructure· Identifiers→

Dictionary term · Stable

Data safe haven

A secure data-handling environment that allows controlled, audited access to sensitive datasets for approved research, applying technical, physical, and procedural safeguards; effectively a synonym for trusted research environment (TRE) in much current usage, though the term has older roots in NHS information governance.


Dictionary term · Stable

Five Safes framework

A framework for the safe use of sensitive data in research, articulated by the UK Office for National Statistics, that organises controls under five dimensions: Safe People, Safe Projects, Safe Settings, Safe Data, and Safe Outputs.


Dictionary term · Stable

Trusted research environment

A secure computing environment — typically delivered as a remote-access workspace with controlled inbound/outbound data flows — that allows accredited researchers to analyse sensitive data in situ without exporting the data, supporting privacy-preserving secondary research use.


Dictionary term · Stable

Sensitive-data repository

A repository specifically designed to hold sensitive research data — typically personal data, health data, criminal-justice data, commercially-confidential data, or culturally-sensitive Indigenous data — with enhanced access controls, audit logging, contractual access conditions, and (often) a secure analysis environment.


Dictionary term · Stable

Dataset landing page

The human-readable web page that a dataset's persistent identifier (typically a DataCite DOI) resolves to, presenting the dataset's title, creators, description, identifiers, dates, version history, related works, access conditions, and a link to download or request the data.
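The link between identifier and landing page can be illustrated by turning a DOI into its resolver URL. The sketch below assumes the public `https://doi.org/` resolver; the function name and the DOI used in the usage note are hypothetical.

```python
from urllib.parse import quote


def doi_to_resolver_url(doi: str) -> str:
    """Normalise a DOI string and return the URL a resolver would redirect from."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return "https://doi.org/" + quote(doi, safe="/")
```

For example, `doi_to_resolver_url("doi:10.1234/example")` yields `https://doi.org/10.1234/example`; following that URL is what should lead a reader to the landing page rather than directly to the data files.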


Dictionary term · Stable

Joint Declaration of Data Citation Principles

The 2014 statement produced by FORCE11's Data Citation Synthesis Group, signed by a wide community of publishers, funders, repositories, and infrastructure providers, that articulates eight principles for the citation of research data in scholarly communication.


Dictionary term · Stable

Data citation principle

Any of the eight principles articulated in the Joint Declaration of Data Citation Principles (FORCE11, 2014): importance, credit and attribution, evidence, unique identification, access, persistence, specificity and verifiability, and interoperability and flexibility of data citations in scholarly communication.


Dictionary term · Stable

Data publication platform

A platform that supports the publication of research data as a citable artefact — assigning a persistent identifier, presenting a landing page, and applying review, curation, or peer-review processes — distinct from purely depositional storage.


Dictionary term · Stable

Domain repository

Synonym for discipline-specific repository: a repository whose scope is a particular research domain (or sub-domain), with curation practices and metadata tailored to that domain.


Dictionary term · Stable

Generalist repository

A repository that accepts research outputs from any discipline, applying domain-agnostic curation and discovery, and serving as a deposit destination for outputs that have no natural discipline-specific home or whose authors prefer a single multidisciplinary venue.


Dictionary term · Stable

Discipline-specific repository

A repository whose scope is bounded to a particular research discipline or sub-discipline, with curation practices, metadata schemas, and community standards tailored to that domain's data types, terminologies, and norms.


Dictionary term · Stable

FAIRsharing (concept)

A curated, community-driven registry of databases, standards (metadata, identifiers, formats, terminologies), and data policies relevant to research data, maintained at the University of Oxford with linkage to funders, journals, and standards organisations.


Dictionary term · Stable

Re3data (concept)

Registry of Research Data Repositories: a global registry, operated by DataCite and partner institutions, that lists research data repositories worldwide with descriptive metadata about their disciplines, content types, access conditions, and policies, helping researchers locate suitable repositories for deposit and discovery.


Dictionary term · Stable

UK Data Service (concept)

A UK ESRC-funded data infrastructure that holds, curates, and provides access to social, economic, and population data resources for research, learning, and policy, comprising the UK Data Archive at the University of Essex and partner institutions.


Dictionary term · Stable

ICPSR (concept)

Inter-university Consortium for Political and Social Research: a consortium-membership-funded data archive based at the University of Michigan that holds and curates over 10,000 social-science research datasets, providing access to member institutions worldwide.


Dictionary term · Stable

Harvard Dataverse (concept)

A free research-data repository operated by Harvard University on the open-source Dataverse software platform, accepting datasets from researchers worldwide, minting DataCite DOIs, and serving as the flagship instance of the global Dataverse network.


Dictionary term · Stable

Dryad (concept)

A non-profit generalist research data repository operated by Dryad Data Inc. (in partnership with the California Digital Library) that publishes datasets linked to peer-reviewed papers, mints DataCite DOIs, and applies curation review before publication.


Dictionary term · Stable

Figshare (concept)

A commercial generalist research repository operated by Digital Science that accepts datasets, figures, presentations, papers, software, and other research artefacts, minting DataCite DOIs and offering institutional-branded instances ('Figshare for Institutions') alongside the public service.


Dictionary term · Stable

Zenodo (concept)

A free generalist research repository operated by CERN and developed under OpenAIRE that accepts deposits of datasets, software, publications, presentations, posters, and other research artefacts, minting DataCite DOIs and providing free preservation up to a per-record size limit.


Dictionary term · Stable

GitHub mirror

A copy of a Git repository (or set of repositories) hosted on GitHub that tracks an upstream source repository elsewhere, typically maintained for redundancy, visibility, or community-engagement reasons rather than as the canonical primary copy.


Dictionary term · Stable

Software Heritage archive

A non-profit international initiative based at Inria that systematically crawls, archives, and preserves the world's publicly available source code, including its full version-control history, and issues persistent identifiers (Software Hash Identifiers, SWHIDs) to every archived artefact.
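SWHIDs are intrinsic identifiers: they are derived from the archived object itself rather than assigned by a registry. As a minimal sketch, the identifier for a single file's content (the `cnt` object type) can be computed with the same hash Git uses for a blob. The function name is illustrative, and directories, revisions, and qualifiers are out of scope here.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Return the SWHID for raw file content, using the Git blob hashing scheme."""
    # Git-style header: object type, content length in bytes, then a NUL byte.
    header = b"blob %d\x00" % len(data)
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"
```

Because the identifier depends only on the bytes, anyone holding a copy of the file can recompute it and check that it matches the archived artefact.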


Dictionary term · Stable

Code repository

A version-controlled storage location for source code, typically operated on top of a distributed version-control system such as Git, exposing the code's full revision history, branches, tags, and (often) collaboration features such as issues, pull requests, and code review.


Dictionary term · Stable

Tissue bank

A specific kind of biobank focused on the collection, processing, storage, and distribution of human tissue samples (typically solid tissue specimens from surgical or post-mortem sources), governed under tissue-banking regulation in the relevant jurisdiction.


Dictionary term · Stable

Sample repository

A repository for physical research samples — geological, environmental, biological, or material — that catalogues, stores, and provides access to samples for downstream analysis, often issuing persistent identifiers (IGSN, DataCite DOI) for citation and provenance tracking.


Dictionary term · Stable

Biorepository

A facility or organisation that collects, processes, stores, and distributes biological materials and their associated data for research, encompassing both human and non-human samples; some communities use the term in preference to 'biobank' to signal a broader scope or a collection tied to specific research projects.


Dictionary term · Stable

Biobank

An organised collection of biological samples (typically human samples such as blood, tissue, DNA, urine) together with their associated clinical, demographic, and lifestyle data, governed for use in biomedical research.


Dictionary term · Stable

National data infrastructure

A coordinated, nationally-scoped programme and set of services for the storage, sharing, and reuse of research data within a country, typically combining funding policy, technical infrastructure (repositories, compute, federation), training, and governance.


Dictionary term · Stable

Data hub

A central node in a data ecosystem that aggregates, harmonises, and brokers access to data from multiple upstream sources, exposing the harmonised data to downstream consumers via curated APIs, query interfaces, or download endpoints.


Dictionary term · Stable

Federated data infrastructure

A data infrastructure in which data, services, and access controls remain distributed across multiple independent nodes (typically operated by different organisations) but are made discoverable, queryable, and usable as a unified resource through shared protocols, vocabularies, and identity-federation.
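The contrast with central aggregation is that the query travels to the data rather than the data to the query. A minimal sketch, assuming each node is represented by a hypothetical callable that answers a query against its own local holdings:

```python
from typing import Callable

# A node is any callable that takes a query string and returns matching records.
Node = Callable[[str], list[dict]]


def federated_search(nodes: dict[str, Node], query: str) -> list[dict]:
    """Send the same query to every node and merge the answers.

    Data stays on each node; only the hits travel back, tagged with their origin.
    """
    results: list[dict] = []
    for name, node in nodes.items():
        try:
            hits = node(query)
        except Exception:
            # An unreachable or failing node should not break the whole federation.
            continue
        results.extend({**hit, "node": name} for hit in hits)
    return results
```

In a real federation the shared protocol, vocabulary mapping, and identity checks would sit inside each node call; here they are deliberately left out to show only the fan-out and merge pattern.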


Dictionary term · Stable

Data warehouse

A central repository of structured data, integrated from multiple operational sources, modelled for analytical querying (typically with a star or snowflake schema), and optimised for read-heavy workloads supporting reporting and decision-making.
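A star schema places one fact table of measurements at the centre, with dimension tables around it that describe each measurement. The sketch below builds a tiny in-memory example with Python's standard `sqlite3` module; the table names, columns, and figures are invented for illustration.

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Create a tiny star schema: one fact table keyed to two dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_repository (repository_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_year (year_id INTEGER PRIMARY KEY, year INTEGER);
        CREATE TABLE fact_deposits (
            repository_id INTEGER REFERENCES dim_repository(repository_id),
            year_id INTEGER REFERENCES dim_year(year_id),
            deposit_count INTEGER
        );
        INSERT INTO dim_repository VALUES (1, 'Repo A'), (2, 'Repo B');
        INSERT INTO dim_year VALUES (1, 2023), (2, 2024);
        INSERT INTO fact_deposits VALUES (1, 1, 10), (1, 2, 15), (2, 2, 7);
        """
    )
    return conn


def deposits_per_repository(conn: sqlite3.Connection) -> list[tuple]:
    """Typical analytical query: join the fact table to a dimension and aggregate."""
    return conn.execute(
        """
        SELECT r.name, SUM(f.deposit_count)
        FROM fact_deposits AS f
        JOIN dim_repository AS r ON r.repository_id = f.repository_id
        GROUP BY r.name
        ORDER BY r.name
        """
    ).fetchall()
```

Reporting queries of this shape only read; that is the workload the warehouse model is optimised for.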


Dictionary term · Stable

Data lake

A storage repository that holds large volumes of structured, semi-structured, and unstructured data in their native formats, avoiding schema-on-write requirements so that data can be ingested cheaply and structured only at the time of read or analysis (schema-on-read).
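Structuring data only at read time can be sketched as follows: raw lines are stored exactly as they arrived, and a schema is applied only when a consumer reads them. The sample lines, the schema, and the function name are assumptions made for the example.

```python
import json
from typing import Callable, Iterable, Iterator

# Raw items are kept as ingested, with no validation at write time.
RAW_LAKE = [
    '{"id": "1", "size_mb": "12.5", "format": "csv"}',
    '{"id": 2, "size_mb": 3}',
    "not json at all",
]


def read_with_schema(
    lines: Iterable[str], schema: dict[str, Callable]
) -> Iterator[dict]:
    """Apply a schema at read time, skipping items that do not fit it."""
    for line in lines:
        try:
            raw = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(raw, dict):
            continue
        try:
            # Keep only the fields the reader asked for, cast to the wanted types.
            yield {field: cast(raw[field]) for field, cast in schema.items()}
        except (KeyError, TypeError, ValueError):
            continue
```

Calling `list(read_with_schema(RAW_LAKE, {"id": int, "size_mb": float}))` returns two typed rows and silently drops the malformed line. A different consumer could read the same raw lines with a different schema, which is the flexibility the lake model trades against up-front data quality.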


Dictionary term · Stable

Data commons

A shared data resource — often combined with shared computing and analysis tools — governed by a community under defined access and contribution rules, designed to enable many users to use and add to the resource for collective benefit.


Dictionary term · Stable

Data trust

A legal and organisational structure in which a fiduciary intermediary holds, governs, and brokers access to a body of data on behalf of its contributors and beneficiaries, applying agreed terms of access, use, and accountability.


Dictionary term · Stable

World Data System certification

The historic certification programme of the ICSU World Data System (WDS), under which scientific data centres in the geosciences and related fields were certified as trustworthy; it merged with the Data Seal of Approval in 2017 to form CoreTrustSeal.


Dictionary term · Stable

CoreTrustSeal

A community-based, non-profit certification scheme for trustworthy data repositories, operated by the CoreTrustSeal Foundation, awarded against 16 published requirements covering organisational infrastructure, digital object management, and technical infrastructure.


Dictionary term · Stable

Trusted digital repository

A digital repository whose mission, governance, technical infrastructure, and procedures have been independently assessed against a recognised standard (e.g. CoreTrustSeal, nestor seal, ISO 16363) and judged trustworthy to preserve digital content over the long term.


Dictionary term · Stable

Subject repository

A repository whose contents are connected purely by their discipline, rather than by other factors such as institutional affiliation (see Institutional repository).


Dictionary term · Stable

Researcher webpage

A webpage presenting a researcher's profile, which may also link to their publications.


Dictionary term · Stable

Repository

A service that preserves, manages, and provides access to many types of digital materials in a variety of formats.


Dictionary term · Stable

Open archive

A repository that complies with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and therefore exposes its metadata for harvesting and reuse, most notably by aggregator services that compile records from many repositories.
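In practice a harvester talks to such a repository through plain HTTP requests carrying an OAI-PMH verb. The sketch below only builds the URL for a `ListRecords` request; the helper name and the base URL in the usage note are hypothetical, while `verb`, `metadataPrefix`, `from`, `set`, and `resumptionToken` are arguments defined by the protocol.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    from_date: Optional[str] = None,
    set_spec: Optional[str] = None,
    resumption_token: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # A resumption token is an exclusive argument: it replaces all the others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"
```

For instance, `list_records_url("https://repository.example.org/oai", from_date="2024-01-01")` produces a request for Dublin Core records changed since that date. A harvester would fetch the URL, parse the XML response, and repeat with any resumption token it receives until the list is complete.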


Dictionary term · Stable

Institutional webpage

A webpage that is associated with the institution at which the author is employed.


Dictionary term · Stable

Institutional repository

An online digital collection of research outputs (see Repository) connected by their affiliation with a specific institution. Institutional repositories are most commonly associated with universities and other academic organisations, so the contents of a single institutional repository may cover a range of disciplines. An institutional repository is often managed as part of a wider suite of services supporting scholarly communication, Open Access, and Open Education.
