Tag: Software Carbon Intensity

  • Digital sustainability: the environmental cost of data storage and preservation

    The instinct in modern research is to keep everything. Storage is cheap, deletion feels risky, and the principles of openness and reproducibility seem to counsel retaining as much as possible for as long as possible. But this instinct conceals a real and growing cost. Storing data, running computations and preserving digital material for the long term all consume energy, and energy carries a carbon footprint. The cloud is not a weightless abstraction; it is data centres drawing power and demanding cooling, somewhere, continuously. As research becomes ever more data-intensive, the environmental cost of its digital life — storage, computation, preservation — can no longer be treated as invisible. Digital sustainability is the discipline of taking that cost seriously, and it is the subject of this article, which draws on the sustainable-research domain of the CASRAI Dictionary.

    The hidden cost of keeping everything

    The first thing digital sustainability asks us to see is that “keep it just in case” is not a cost-free default. Every dataset retained indefinitely occupies storage that must be powered, cooled, maintained, migrated to new media over time, and backed up — and the aggregate of countless such decisions across the research system is substantial. There is a real tension here with the open-data ideal. The drive to make data findable and reusable is valuable, but it can shade into digital hoarding: keeping vast quantities of low-value data on the vague principle that more is always better, without asking whether a dataset is worth its ongoing cost. The FAIR principles call for data to be findable and reusable — not for everything to be kept forever regardless of value. Distinguishing data worth preserving from data that need not be is itself an act of stewardship, not a betrayal of openness.

    Appraisal and data minimisation

    The practices that respond to this are appraisal and data minimisation. Appraisal — long established in the archival and records-management traditions — is the disciplined process of deciding what to keep, for how long, and what may responsibly be discarded, based on enduring value rather than reflex. Data minimisation, familiar also from data protection, is the principle of collecting and retaining only what is genuinely needed. Applied to research, these practices mean making conscious decisions: which raw data must be preserved to support published results and which intermediate files can be regenerated if ever needed; which datasets have lasting reuse value and which were transient. This is not an argument for carelessly deleting valuable data — the cost of losing irreplaceable data far exceeds the cost of storing it. It is an argument for deciding, deliberately and well, rather than defaulting to indiscriminate retention. Good appraisal keeps what matters and lets go of what does not, serving both sustainability and the long-term usability of the record.
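
    The appraisal questions above can be sketched as a simple decision rule. This is only an illustration: the `Dataset` fields and the three outcomes (`keep`, `review`, `discard`) are assumptions made for the example, not part of any archival standard, and a real appraisal policy would weigh legal, ethical and disciplinary factors that a few boolean flags cannot capture.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical appraisal record; the field names are illustrative."""
    name: str
    supports_publication: bool  # raw data underpinning published results
    regenerable: bool           # can be recomputed from retained inputs
    reuse_value: bool           # judged to have lasting reuse value


def appraise(ds: Dataset) -> str:
    """Return 'keep', 'review' or 'discard' for one dataset."""
    # Data behind published results, or with lasting reuse value, is kept.
    if ds.supports_publication or ds.reuse_value:
        return "keep"
    # Intermediate files that can be regenerated are candidates for removal.
    if ds.regenerable:
        return "discard"
    # Irreplaceable data of unclear value goes to a person, not to deletion.
    return "review"


if __name__ == "__main__":
    for ds in [
        Dataset("survey_raw.csv", True, False, True),
        Dataset("intermediate_cache.bin", False, True, False),
        Dataset("old_instrument_dump.dat", False, False, False),
    ]:
        print(ds.name, "->", appraise(ds))
```

    The rule deliberately never discards data that cannot be regenerated: anything irreplaceable but of uncertain value is routed to human review, which reflects the point that losing irreplaceable data costs far more than storing it.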

    Green software and computation

    Storage is only part of the picture; computation has its own footprint. The green software movement, advanced by organisations such as the Green Software Foundation, aims to reduce the environmental impact of software itself. A central concept is Software Carbon Intensity (SCI), a specification for calculating the carbon emissions associated with running software as a rate per unit of useful work, so that the impact can be quantified, compared and reduced rather than guessed at. For research, the principles translate into practical questions: is a computation less efficient than it could be; is it run repeatedly when results could be cached; is the workload run where and when the energy is cleaner? Efficient, well-considered computation is not only cheaper and faster but less carbon-intensive, and measuring impact, as SCI encourages, is the precondition for managing it.
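
    To make the measurement concrete, the SCI specification expresses its score as SCI = ((E × I) + M) per R, where E is the energy the software consumes (kWh), I is the carbon intensity of that energy (gCO2eq per kWh), M is the embodied emissions attributed to the hardware (gCO2eq) and R is the functional unit, such as one analysis run. The sketch below simply evaluates that formula; the function name and every input value are invented for illustration, and obtaining trustworthy values for E, I and M is the hard part in practice.

```python
def sci_score(energy_kwh: float,
              intensity_g_per_kwh: float,
              embodied_g: float,
              functional_units: float) -> float:
    """Evaluate SCI = ((E * I) + M) per R, in gCO2eq per functional unit."""
    if functional_units <= 0:
        raise ValueError("functional_units (R) must be positive")
    operational_g = energy_kwh * intensity_g_per_kwh  # E * I
    return (operational_g + embodied_g) / functional_units


if __name__ == "__main__":
    # Made-up figures: 12 kWh at 400 gCO2eq/kWh, 1200 gCO2eq of embodied
    # emissions attributed to the job, spread over 100 analysis runs.
    print(sci_score(12.0, 400.0, 1200.0, 100))  # prints 60.0
```

    Because the score is a rate rather than a total, it can fall either by emitting less for the same work or by doing more useful work for the same emissions, which is what makes it suitable for comparing two versions of the same pipeline.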

    Preservation that lasts: OAIS

    Sustainability is not only about using less; it is also about preserving well, so that what is kept genuinely endures and the energy spent keeping it is not wasted. The reference model for long-term digital preservation is OAIS, the Open Archival Information System (published as ISO 14721), which sets out what a trustworthy digital archive must do to preserve information over the long term and keep it accessible and understandable to future users. OAIS matters to digital sustainability in two ways. First, preservation is itself an ongoing activity with an environmental cost, and doing it according to a sound model means that cost buys real durability rather than slow decay. Second, preserving fewer things well (properly described, in sustainable formats, in a trustworthy archive) is far better, environmentally and intellectually, than preserving many things badly, where data accumulates and yet quietly becomes unusable through neglect. Good preservation and disciplined appraisal are two sides of the same sustainable practice.

    Sustainability and FAIR, properly understood

    None of this is in conflict with FAIR or with open research, properly understood. FAIR is about good stewardship — making the data that is worth keeping findable, accessible, interoperable and reusable — not about hoarding. A sustainable approach is, in fact, a more honest expression of FAIR: it concentrates effort on the data that genuinely merits it, rather than spreading thin attention and real resources across everything indiscriminately. Sustainability and good data stewardship point in the same direction: keep what matters, describe it well, preserve it properly, and let go of what does not earn its keep.

    A consistent vocabulary for digital sustainability

    For sustainable practice to be applied consistently — across repositories, institutions and funders — the concepts involved, such as retention periods, appraisal decisions, preservation levels and format requirements, must be described in ways that mean the same thing everywhere. That consistency is what the CASRAI Dictionary works towards: a shared vocabulary so that decisions about what to keep, how to preserve it and for how long are understood the same way wherever they are recorded. And because appraising, curating and preserving data well is genuine, skilled work, it can be described in the same shared framework as any other contribution — the CRediT taxonomy and the wider apparatus of research administration. The most sustainable digital research is not the research that stores the least, but the research that decides most carefully what is worth keeping — and then keeps it well.

  • Greenhouse-gas emissions reporting for research institutions: Scopes 1, 2 and 3

    Research is an energy-intensive activity. Laboratories run power-hungry equipment around the clock, computing clusters draw enormous amounts of electricity, and the scholarly enterprise is bound together by a culture of international travel to conferences and collaborations. For a long time these costs were invisible in environmental terms — counted, if at all, only as line items in a budget. That is changing. Funders, institutions and researchers are beginning to ask what the carbon footprint of research actually is, and to account for it with something approaching the rigour they already apply to data and finance. This article surveys how greenhouse-gas emissions reporting is taking shape for research institutions, drawing on the sustainable-research domain of the CASRAI Dictionary.

    The GHG Protocol and its three scopes

    The dominant framework for emissions accounting, in research as everywhere else, is the Greenhouse Gas Protocol. Its central contribution is a way of organising an organisation’s emissions into three scopes, which together prevent both double-counting and convenient omission. Scope 1 covers direct emissions from sources the organisation owns or controls — fuel burned on site, institutional vehicles, and the gases released by certain laboratory processes. Scope 2 covers indirect emissions from the energy the organisation buys, above all purchased electricity — the emissions produced by the power station, attributed to the institution that consumes the power. Scope 3 covers all other indirect emissions across the value chain: business travel, commuting, procured goods and services, waste, and the embodied carbon of the equipment and supplies an institution buys. For most research organisations, Scope 3 is by far the largest and the hardest to measure — and it is where research has some of its most distinctive emissions.
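
    The three-scope structure lends itself to a simple data model. The following is a minimal sketch, assuming an inventory held as (source, scope, tonnes CO2e) records; the `Scope` enum, the helper functions and all of the figures are hypothetical and chosen only to show how tagging each source with exactly one scope avoids double-counting.

```python
from collections import defaultdict
from enum import Enum


class Scope(Enum):
    SCOPE_1 = 1  # direct emissions from owned or controlled sources
    SCOPE_2 = 2  # indirect emissions from purchased energy
    SCOPE_3 = 3  # all other indirect emissions in the value chain


# Invented inventory, in tonnes of CO2e per year.
INVENTORY = [
    ("on-site gas boilers", Scope.SCOPE_1, 900.0),
    ("institutional vehicles", Scope.SCOPE_1, 100.0),
    ("purchased electricity", Scope.SCOPE_2, 2500.0),
    ("business travel", Scope.SCOPE_3, 1800.0),
    ("procured goods and services", Scope.SCOPE_3, 4200.0),
    ("staff commuting", Scope.SCOPE_3, 500.0),
]


def totals_by_scope(inventory):
    """Sum emissions per scope; each record counts towards one scope only."""
    totals = defaultdict(float)
    for _source, scope, tonnes in inventory:
        totals[scope] += tonnes
    return dict(totals)


def shares(totals):
    """Express each scope's total as a fraction of the whole footprint."""
    grand_total = sum(totals.values())
    return {scope: tonnes / grand_total for scope, tonnes in totals.items()}


if __name__ == "__main__":
    for scope, share in shares(totals_by_scope(INVENTORY)).items():
        print(f"{scope.name}: {share:.0%}")
```

    With these made-up numbers Scope 3 accounts for roughly two thirds of the total, which mirrors the pattern described above but should not be read as data about any real institution.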

    The carbon cost of conferences and travel

    Academic culture has long treated frequent international travel as normal and even as a marker of seniority and engagement. Under emissions accounting, that travel becomes visible as a substantial Scope 3 source. Flying researchers across the world to present work and attend meetings carries a real carbon cost, and the recognition of this has driven genuine change in practice: more virtual and hybrid conferences, more deliberate choices about which trips are truly necessary, and policies that encourage rail over short-haul flights where feasible. The point is not to end scholarly exchange — collaboration and the meeting of minds are essential to research — but to make its environmental cost a conscious factor rather than an unexamined habit. Measuring travel emissions is the first step towards managing them.

    Laboratory energy and sustainable lab frameworks

    The other distinctive source is the laboratory itself. Ultra-low-temperature freezers, fume hoods, autoclaves and specialist instruments make labs among the most energy-intensive spaces in any institution, often consuming several times more energy per unit area than ordinary offices. Two frameworks have become prominent in addressing this. LEAF (the Laboratory Efficiency Assessment Framework) offers laboratories a structured set of actions and an accreditation scheme to reduce their environmental impact — covering energy, waste, water, procurement and sample storage — while also saving money. My Green Lab provides certification and standards for sustainable laboratory practice, working with the scientific community and suppliers to drive improvement. Both turn the abstract goal of a greener lab into concrete, assessable steps, and both give researchers a recognised way to demonstrate that their practice has improved.

    The carbon cost of computing

    As research becomes ever more computational, the emissions of computing demand their own attention. Large-scale data analysis, simulation and especially the training of machine-learning models can consume very large amounts of electricity, and the associated emissions depend heavily on when and where the computation runs — on the carbon intensity of the electricity grid powering the data centre at that moment. The Software Carbon Intensity (SCI) specification, developed within the Green Software Foundation, provides a methodology for calculating the carbon intensity of a software application, expressing emissions per unit of useful work. For research computing, frameworks like this make it possible to measure and compare the carbon cost of computational work, and they point towards practical responses — running flexible workloads when and where the grid is cleaner, and choosing efficient methods — so that the growth of computational research does not silently inflate its footprint.
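
    Running a flexible workload when the grid is cleaner can be sketched as a search over a forecast. The example below assumes an hourly forecast of grid carbon intensity (gCO2eq per kWh) is already available as a list; the function name and the forecast values are invented, and a real scheduler would also have to respect deadlines, queue policies and forecast uncertainty.

```python
def cleanest_start(forecast, duration_hours):
    """Return (start_index, mean_intensity) of the lowest-carbon window.

    forecast is a list of hourly grid intensities; the job is assumed to
    occupy duration_hours consecutive hours.
    """
    if not 0 < duration_hours <= len(forecast):
        raise ValueError("duration_hours must fit within the forecast")
    # Sliding-window sum: start with the first window, then shift by one hour.
    total = sum(forecast[:duration_hours])
    best_start, best_total = 0, total
    for start in range(1, len(forecast) - duration_hours + 1):
        total += forecast[start + duration_hours - 1] - forecast[start - 1]
        if total < best_total:
            best_start, best_total = start, total
    return best_start, best_total / duration_hours


if __name__ == "__main__":
    # Hypothetical forecast for the next eight hours.
    hourly_intensity = [420, 390, 310, 180, 150, 170, 260, 380]
    start, mean_intensity = cleanest_start(hourly_intensity, 3)
    print(f"start in {start} h, mean {mean_intensity:.0f} gCO2eq/kWh")
```

    For this invented forecast the best three-hour window begins at index 3. Multiplying the job's expected energy use by the mean intensity of the chosen window gives a rough estimate of its operational emissions, which can then be compared with the estimate for starting immediately.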

    Why measurement comes first

    Running through all of this is a simple principle: you cannot manage what you do not measure. The value of the GHG Protocol’s scopes, of LEAF and My Green Lab, and of the SCI specification is that they make the carbon cost of research visible and comparable. Once an institution can see that its travel dominates its footprint, or that its freezers or its computing draw disproportionate energy, it can act with proportion rather than guesswork. And once emissions are measured in standard ways, they can be compared across institutions and over time, progress can be demonstrated to funders increasingly interested in sustainability, and good practice can be recognised. Sustainable research is not a matter of gesture; it rests on honest accounting.

    A consistent vocabulary for sustainability data

    For emissions data to be compared and aggregated across institutions, funders and reporting frameworks, the terms involved must mean the same thing everywhere: what counts as Scope 1, 2 or 3, how travel and procurement emissions are categorised, what a sustainability metric refers to. That consistency is what the CASRAI Dictionary works towards: a shared vocabulary so that the sustainability information flowing through institutional and funder reporting is understood identically wherever it appears. And because sustainable practice is increasingly part of how research is conducted and assessed, the work of greening a lab or reducing computational impact sits alongside the other contributions captured in frameworks such as the CRediT taxonomy and its full set of contribution roles. As research confronts its own environmental impact, the discipline of measurement, the same instinct that produced sound data management, is what turns concern into change. Institutions wanting to integrate emissions reporting into their operations will find that it sits naturally within wider research administration.