v2026.1714 entries · CC-BY 4.0
CASRAI

Editorial category Track B

Research data infrastructure

Trusted repositories, EOSC, biobanks, data trusts, federated infrastructure.

  • 21 June 2026

    Identifiers for Things, Not Just Papers: IGSN and PIDINST

    Persistent identifiers are familiar for articles, datasets, and people, but the physical objects of research (rock cores, water samples, and the instruments that measure them) have long lacked stable references. The IGSN for samples and the PIDINST work for instruments extend persistent identification to the physical world, making physical research objects findable, citable, and connectable to the data they produce.

  • 21 June 2026

    Anonymising research data: k-anonymity, differential privacy and the re-identification risk

    Sharing data about people without exposing the people themselves is one of the hardest problems in research data management. This article distinguishes anonymisation from pseudonymisation, explains the privacy models researchers actually use (k-anonymity, l-diversity and differential privacy), and introduces the practical guidance from the UK Anonymisation Network (UKAN) and the ICO’s anonymisation code. It also confronts the uncomfortable reality that re-identification is often easier than it looks.

  • 20 June 2026

    Big Data and the Vs of Data Explained for Research

    Big data describes datasets so large, fast or varied that traditional tools cannot handle them. This guide explains the defining Vs, from volume and velocity to veracity and value, how distributed processing copes, and what big data means for research and FAIR data.

  • 20 June 2026

    Cloud Computing for Research Infrastructure

    Cloud computing delivers on-demand, elastic, measured computing resources over a network. This explainer defines it using the NIST model, distinguishes IaaS, PaaS and SaaS, and weighs its role in reproducible research alongside cost and governance considerations.

  • 20 June 2026

    Open Data in Public-Health Research: Standards

    Open and FAIR data principles are reshaping how public-health research data are shared and reused. This guide explains data-sharing standards, anonymisation at a high level, metadata, persistent identifiers, and the governance that enables responsible reuse.

  • 20 June 2026

    Incidence vs Prevalence: Key Epidemiological Measures

    Incidence counts new cases over time, while prevalence counts existing cases at a point or over a period. This guide defines each measure, shows how they are calculated, explains the relationship between them, and clarifies when to use which.

  • 19 June 2026

    Death Rate and Mortality Statistics: Definitions

    A death rate measures deaths relative to a population, but a crude rate and an age-standardised rate answer different questions. This guide defines both, explains why standardisation is needed for comparison, and outlines cause-of-death coding and data sources.

  • 18 June 2026

    Genomic Data-Sharing Standards: GA4GH and Responsible Access Explained

    Genomic data sharing relies on common standards for formats, metadata, consent and controlled access. This guide explains the role of the Global Alliance for Genomics and Health, FAIR principles and controlled-access archives in moving genetic data responsibly.

  • 18 June 2026

    Amino Acids: Notation, Protein Data and How Sequences Are Recorded

    Amino acids are the chemical building blocks of proteins. This guide explains the 20 standard amino acids, their one- and three-letter notation, and how protein sequence and structure data are recorded and shared through UniProt and the Protein Data Bank.

  • 18 June 2026

    FAIR Principles for Research Data Explained

    The FAIR principles make research data Findable, Accessible, Interoperable and Reusable. Published by Wilkinson et al. in 2016, they emphasise persistent identifiers and rich metadata. This explainer defines each principle and clarifies how FAIR differs from open data.

  • 18 June 2026

    Census and Population Data: Sources and Standards

    A census is a complete enumeration of a population at a defined moment, the backbone of official population statistics. This guide explains how the data are collected and standardised, the roles of national statistics offices and the UN, and the shift toward register-based methods.

  • 18 June 2026

    Life Expectancy: How It Is Calculated and the Data Behind It

    Life expectancy is the average number of years a person in a given population is expected to live, estimated from a life table built on mortality data. This guide explains the method, the difference between period and cohort measures, and the data sources involved.

  • 15 June 2026

    Open scholarly infrastructure and the POSI principles

    The identifiers, registries and services that hold scholarship together are infrastructure — and infrastructure can fail, lock in, or disappear. The POSI principles set out how the organisations that run it can be made durable and accountable.

  • 15 June 2026

    FAIR data in practice: making research data findable and reusable

    FAIR is widely cited and often misunderstood. What Findable, Accessible, Interoperable and Reusable actually require in practice, why FAIR is not the same as open, and the concrete steps that move a dataset from a hard drive to a genuinely reusable output.

  • 15 June 2026

    Licensing research data: CC-BY, CC0 and when to use each

    A dataset without a clear licence is data nobody can confidently reuse. The difference between CC0 and CC BY for data, why software needs different licences, and how to choose a licence that makes your data genuinely reusable.

  • 13 June 2026

    Trusted repositories and the EOSC: where research data should live

    FAIR data has to live somewhere, and not every place is fit to hold it. Trusted digital repositories, CoreTrustSeal certification and the European Open Science Cloud set out where research data should be deposited.

  • 12 June 2026

    Repository certification: CoreTrustSeal and the markers of a trustworthy repository

    Where should research data live so that it is still usable in twenty years? Repository certification — CoreTrustSeal, the TRUST Principles, the nestor Seal and ISO 16363 — gives concrete, auditable answers about what makes a digital repository genuinely trustworthy.

  • 11 June 2026

    Data citation: giving datasets the credit they deserve

    Datasets underpin findings but are rarely cited as first-class objects. How DataCite DOIs, the FORCE11 Joint Declaration of Data Citation Principles and the CRediT Data curation role together make data both citable and creditable.

  • 11 June 2026

    Data availability statements: what to write and where to deposit

    A data availability statement is now a routine journal requirement, but “available on request” statements rarely deliver. What to write, how to make data genuinely FAIR, and how to choose between generalist and domain repositories.

  • 10 June 2026

    DataCite and the data-citation infrastructure

    Articles have long had DOIs; datasets, software and other research outputs needed the same. DataCite provides persistent identifiers and a metadata schema that make data first-class, citable, connectable objects in the scholarly record.

  • 31 May 2026

    Finding research data: dataset discovery and data search engines

    Depositing data in a repository is only half the battle — if no one can find it, it might as well not exist. Registries of repositories, dataset search engines and structured metadata are what turn a deposited dataset into a discoverable one.

  • 26 May 2026

    Federated analysis: bringing computation to the data

    Some of the most valuable research data — health records, genomic data, sensitive registries — cannot easily be moved or pooled. Federated analysis flips the usual model: instead of bringing data to the computation, it sends the computation to the data, enabling collaboration without the data ever leaving home.

  • 12 February 2026

    EOSC Federation governance: what changed in 2026

    The 2026 EOSC Federation governance refresh: new node-membership criteria, the sustainability funding model, and how integrators should reposition.

  • 8 January 2026

    FAIR data assessment frameworks: a buyer’s guide for institutions

    RDA Maturity Model, F-UJI, FAIR-Aware, CESSDA, ARDC: which FAIR assessment tool to choose, when, and how to integrate with institutional repositories.

  • 1 January 1970

    Documenting social science data: the Data Documentation Initiative (DDI)

    Survey and microdata are only reusable if every variable, code and question is documented. The Data Documentation Initiative (DDI) is the international metadata standard that makes that possible for social, behavioural and economic data. This article distinguishes DDI Codebook from DDI Lifecycle, explains variable-level metadata, and shows why major archives such as the UK Data Service, ICPSR and GESIS built their holdings on it, and how DDI fits the wider metadata landscape.

  • 1 January 1970

    Choosing Where to Put Your Data: re3data and the Landscape of Research Data Repositories

    Depositing research data in a suitable repository is now an expectation of most major funders, but with thousands of repositories in existence (institutional, disciplinary, and general-purpose), choosing the right one is not straightforward. re3data, the Registry of Research Data Repositories, is the principal tool for navigating this landscape: a searchable, indexed registry of more than 3,000 research data repositories worldwide, operated under the auspices of DataCite. This article examines re3data; its metadata schema and badge system, which signal repository properties such as data access, licensing, and persistent identifier support; the main types of repository; how funders including the NIH, Wellcome, and Horizon Europe point researchers towards repository registries when selecting a repository; and the complementary role of FAIRsharing as a registry of databases, standards, and policies.

  • 1 January 1970

    FAIR Digital Objects: Making Research Data Machine-Actionable Beyond Metadata

    FAIR principles have transformed expectations around research data sharing, but the original FAIR framework addresses metadata and findability rather than the machine-actionable operations needed to actually use data at scale. The FAIR Digital Object (FDO) framework, developed through the FDO Forum and built on the Digital Object Interface Protocol (DOIP) from the Corporation for National Research Initiatives, extends FAIR by wrapping data objects with typed, machine-executable operations. Where a standard DOI resolves to a landing page for human readers, an FDO exposes typed interfaces that allow software to retrieve, validate, and act on data without human intervention. This article examines the FDO framework, the CNRI Cordra software that implements it, the Research Data Alliance FAIR Digital Objects working group, and practical deployments in the European Open Science Cloud and biodiversity informatics.

  • 1 January 1970

    Linked Data and Knowledge Graphs in Scholarly Research: RDF, SPARQL, Wikidata and ORKG

    Linked data technologies developed by the World Wide Web Consortium — principally the Resource Description Framework and the SPARQL query language — provide a principled way to connect research objects, express relationships between them, and make them queryable by machines as well as humans. Applied to scholarly research, these technologies underpin initiatives such as the Open Research Knowledge Graph at TIB Hannover, Wikidata’s role as a community-maintained scholarly knowledge base, and the widespread adoption of Schema.org’s Dataset markup to make research datasets discoverable by search engines. Understanding how linked data connects papers, datasets, authors, funders and institutions into a navigable web of scholarly knowledge helps research administrators and data infrastructure professionals appreciate the foundations of machine-readable research information systems.
