Tag: Creative Commons

  • What Is Open Access? Routes, Licences and Policy Explained

    Open access is the practice of making peer-reviewed research literature freely available online, free of most copyright and licensing restrictions, so that anyone may read, download and reuse it. The concept was formally defined by the 2002 Budapest Open Access Initiative and reinforced by the 2003 Berlin Declaration, both of which describe free availability on the public internet alongside the right to reuse work with proper attribution.

    Understanding open access means understanding two things at once: the routes by which an article becomes openly available, and the licences that govern what readers may do with it. The two are related but distinct, and policy frameworks such as Plan S operate across both.

    How open access is defined

    The Budapest Open Access Initiative (BOAI) defined open access as literature that is freely available to read, with users permitted to copy, distribute and reuse it for any lawful purpose, subject only to attribution. The Berlin Declaration added the requirement that a complete version be deposited in at least one suitable repository. Together these established that open access is about more than zero price; it is about removing permission barriers too. This distinction is often summarised as gratis open access (free to read) versus libre open access (free to read and reuse).

    The main routes to open access

    Several established routes describe how a work reaches readers. Each carries different cost and reuse implications for authors and institutions.

    Route  | Where it is hosted | Who typically pays | Key characteristic
    Gold   | The publisher’s journal, immediately open | Author or funder, often via an APC | Final version of record is open at publication
    Green  | A repository (institutional or subject) | No author fee | A version is self-archived, sometimes after an embargo
    Diamond | A community or scholar-led journal | Neither author nor reader | No charges to publish or to read
    Hybrid | A subscription journal with an open option | Author or funder, via an APC | Individual articles are opened within a paywalled title

    The gold, green and diamond routes are explored in depth in our companion guide to green, gold and diamond open access. Hybrid publishing remains controversial because an institution can end up paying twice for the same content, once through its journal subscription and again through an article processing charge, an outcome critics call double dipping.

    Creative Commons licences and reuse

    Licensing determines what readers can lawfully do. Most open-access publishing uses Creative Commons (CC) licences, which let authors retain copyright while granting standardised permissions.

    • CC BY permits any reuse, including commercial, with attribution. It is the licence most aligned with the BOAI definition of full reuse.
    • CC BY-SA adds a share-alike condition, so derivatives must carry the same licence.
    • CC BY-NC excludes commercial reuse.
    • CC BY-ND permits redistribution but not derivatives.
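
    The four licences above differ only in the conditions they attach on top of attribution. As a rough illustration, the Python sketch below encodes those conditions as flags. It assumes a deliberately simplified model: the `LicenceTerms` class and the `allows` helper are hypothetical names invented for this example, not part of any Creative Commons tooling, and the real licence texts carry more nuance than three booleans can express.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenceTerms:
    """Simplified, illustrative model of a CC licence's conditions."""
    commercial_use: bool  # may the work be reused commercially?
    derivatives: bool     # may adapted versions be shared?
    share_alike: bool     # must derivatives carry the same licence?


# All four licences require attribution, so it is not modelled as a flag.
CC_LICENCES = {
    "CC BY": LicenceTerms(commercial_use=True, derivatives=True, share_alike=False),
    "CC BY-SA": LicenceTerms(commercial_use=True, derivatives=True, share_alike=True),
    "CC BY-NC": LicenceTerms(commercial_use=False, derivatives=True, share_alike=False),
    "CC BY-ND": LicenceTerms(commercial_use=True, derivatives=False, share_alike=False),
}


def allows(licence: str, *, commercial: bool, adapted: bool) -> bool:
    """Return True if the described reuse fits the licence's conditions."""
    terms = CC_LICENCES[licence]
    if commercial and not terms.commercial_use:
        return False
    if adapted and not terms.derivatives:
        return False
    return True
```

    For example, `allows("CC BY-NC", commercial=True, adapted=False)` is false under this model, while the same call for CC BY is true.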

    The Directory of Open Access Journals (DOAJ) indexes journals that meet recognised quality and licensing standards, and is widely used to identify reputable fully open-access titles. Choosing a licence is a key decision for any author preparing a submission; our guidance for authors covers how licence choice interacts with funder mandates.

    Embargoes and the green route

    Under the green route, publishers may impose an embargo period during which the self-archived version cannot be made public, typically applied to subscription titles to protect their commercial model. The version permitted is often the accepted manuscript rather than the final published version of record. Institutional and subject repositories, including preprint servers, support this route at no cost to authors.

    Plan S and cOAlition S

    Plan S is a policy initiative launched by a group of research funders organised as cOAlition S. Its central principle is that research funded by grants from its member organisations must be published in compliant open-access venues or made openly available without embargo, with funders willing to cover reasonable costs. Plan S has accelerated interest in fee-free models, and recent developments are tracked in our coverage of Plan S and diamond open access in 2026. The policy sits within the broader knowledge equity agenda, which treats open access as a route to fairer participation in scholarship.

    APCs versus no-fee models

    Article processing charges (APCs) fund gold and hybrid publishing by shifting costs from readers to authors or their funders. While this removes the paywall, it can create a barrier for researchers without grant support, particularly in lower-income settings. Diamond open access avoids charges entirely by funding publication through institutions, libraries or scholarly societies, and is increasingly seen as the most equitable model. The trade-off is sustainability, since diamond journals depend on continuing non-commercial support.

    Frequently asked questions

    Is open access the same as free to read?

    Not exactly. Free to read corresponds to gratis open access. The fuller libre form, defined by the Budapest and Berlin declarations, also grants reuse rights, usually through a Creative Commons licence such as CC BY.

    What is the difference between gold and diamond open access?

    Both make the final version openly available at publication. Gold open access is often funded by an article processing charge, whereas diamond open access charges neither authors nor readers, relying instead on institutional or society support.

    Does Plan S require a specific licence?

    Plan S strongly favours licences that permit full reuse, with CC BY as the default expectation, so that funded research can be read and built upon without permission barriers.

    Where can I check whether a journal is genuinely open access?

    The Directory of Open Access Journals (DOAJ) is a widely used index of vetted fully open-access journals. You can also consult our standards dictionary for definitions of the key terms used across open-access policy.

  • Licensing research data: CC BY, CC0 and when to use each

    You can deposit a dataset in a trusted repository, describe it with rich metadata, and give it a DOI — and still leave it effectively unusable, because you forgot the one line that tells a reuser what they are allowed to do with it. A dataset without a clear licence is data nobody can confidently build on: a careful researcher, unsure of the terms, will simply not reuse it. Licensing is therefore not a legal afterthought but the part of the data-infrastructure domain that determines whether a deposit delivers the “R” in FAIR at all. This guide explains the main choices — principally CC0 and CC BY — and when each fits.

    Why a licence is the reusability switch

    The FAIR principles ask that data be Findable, Accessible, Interoperable, and Reusable — and reusability rests explicitly on data being “released with a clear and accessible data usage licence”. Without a licence, default copyright and database rights leave the legal status ambiguous, and ambiguity is fatal to reuse: a would-be user cannot tell whether combining your data with theirs, redistributing it, or building a tool on it is permitted. An explicit, standard, machine-readable licence resolves that uncertainty in advance, for everyone, without anyone having to ask. That is why “attach an explicit licence” is the step that turns a findable dataset into a reusable one.

    The two main choices for data

    CC0 — the public-domain dedication

    CC0 is a Creative Commons tool by which the rights-holder waives, to the fullest extent the law allows, all copyright and related rights in the work — placing it as close to the public domain as possible. For data, CC0 means a reuser can use, combine, modify, and redistribute the data with no conditions at all, including no obligation to attribute. This is widely recommended as the default for research data, and for a specific reason: data are routinely aggregated from many sources, and attribution requirements that stack up across hundreds of datasets (“attribution stacking”) can become legally and practically unworkable. CC0 removes that friction entirely and maximises interoperability. Several major data repositories and infrastructures apply CC0 by default for exactly this reason.

    Importantly, CC0 waives legal requirements, not scholarly norms. Citing the data you use remains an academic and ethical expectation regardless of the licence — CC0 simply means that expectation is enforced by the norms of good scholarship rather than by copyright law.

    CC BY — attribution required

    CC BY permits the same broad reuse — use, adaptation, redistribution, including commercially — but on the single condition that the original creator is credited. For data, CC BY is appropriate where attribution matters enough to be a legal condition, or where a funder or institution requires it. It is the most permissive of the conditional Creative Commons licences and is the default for many open-access publications. The trade-off relative to CC0 is precisely the attribution clause: it guarantees credit, but it reintroduces the attribution-stacking problem when many datasets are combined.
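
    The attribution-stacking trade-off can be made concrete with a small Python sketch. Everything in it is illustrative: the `Dataset` record, the `required_credits` helper and the three sample datasets are hypothetical, and the rule used (any licence whose name starts with "CC BY" carries a legal attribution condition, CC0 does not) is a simplification of the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical record describing one source dataset."""
    name: str
    creator: str
    licence: str  # e.g. "CC0" or "CC BY"


def required_credits(sources: list[Dataset]) -> list[str]:
    """List the credit lines a merged dataset is legally obliged to carry."""
    return [
        f"{ds.name} by {ds.creator} ({ds.licence})"
        for ds in sources
        if ds.licence.startswith("CC BY")  # CC0 waives the legal condition
    ]


# Made-up sources for illustration only.
sources = [
    Dataset("Rainfall series", "Creator A", "CC BY"),
    Dataset("Soil survey", "Creator B", "CC0"),
    Dataset("River gauges", "Creator C", "CC BY"),
]
credits = required_credits(sources)
print(len(credits))  # 2
```

    With three sources the list is short, but it grows with every CC BY dataset merged in, which is the friction CC0 avoids. Citing all three sources remains good scholarly practice either way.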

    Choosing between them

    • Prefer CC0 for data intended for the widest possible aggregation and reuse, especially where the data will be merged with many other sources. It maximises interoperability and removes legal friction; rely on citation norms for credit.
    • Choose CC BY where attribution must be a legal condition, where a funder or repository mandates it, or where the dataset is a discrete, citable product whose creators need enforceable credit.
    • Be cautious with more restrictive clauses. Non-commercial (NC) and No-Derivatives (ND) terms substantially limit reuse and can render data incompatible with other open data; they are generally discouraged for research data unless a specific ethical or legal constraint demands them.
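
    The guidance above amounts to a short decision rule, sketched here in Python. The function names and parameters are hypothetical, and the sketch assumes that a funder or repository mandate simply overrides the other considerations; real cases may need legal or governance advice that no helper function can supply.

```python
from typing import Optional


def suggest_data_licence(
    needs_enforceable_credit: bool = False,
    mandated_licence: Optional[str] = None,
) -> str:
    """Suggest a data licence: mandate first, then CC BY, else CC0."""
    if mandated_licence is not None:
        # Assumption: a funder or repository mandate takes precedence.
        return mandated_licence
    return "CC BY" if needs_enforceable_credit else "CC0"


def restrictive_clauses(licence: str) -> list[str]:
    """Return any NC or ND clauses in a CC licence name; these limit reuse."""
    parts = licence.upper().replace("CC ", "").split("-")
    return [p for p in parts if p in {"NC", "ND"}]
```

    Under these assumptions `suggest_data_licence()` returns "CC0", and `restrictive_clauses("CC BY-NC-ND")` returns both clauses, flagging a licence that deserves a second look before it is applied to research data.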

    Data are not software: a critical caveat

    Creative Commons licences are designed for content (text, images, and data), and Creative Commons itself advises against using them for software. Software has needs that CC licences do not address: patent grants, the distinction between source and compiled code, and copyleft mechanics. For code, use a recognised software licence instead: a permissive one such as MIT, BSD, or Apache 2.0, or a copyleft one such as the GPL. If your deposit bundles a dataset and the code that processes it, license each part appropriately: a CC licence (or CC0) for the data, an OSI-approved software licence for the code. Conflating the two is one of the most common licensing mistakes in research deposits.

    A practical checklist

    1. Confirm you have the right to license the data. Check funder terms, any data-sharing agreements, third-party data within your dataset, and, for personal or sensitive data, consent and governance constraints. A licence cannot grant rights you do not hold.
    2. Default to CC0 for data unless there is a positive reason to require attribution; choose CC BY where there is.
    3. License software separately with an OSI-approved licence; never put code under a Creative Commons licence.
    4. State the licence explicitly in the deposit metadata and in any data availability statement, using the standard licence identifier so it is machine-readable.
    5. Cite the data you reuse regardless of its licence — the scholarly norm holds even when the law does not require it.
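
    Steps 3 and 4 of the checklist can be sketched as a simple check over a deposit's licence metadata. The sketch assumes SPDX identifiers (such as `CC0-1.0`, `CC-BY-4.0` and `MIT`) serve as the standard licence identifier, which the checklist does not specify. The record fields (`path`, `kind`, `licence`), the file paths and the `check_deposit` helper are hypothetical and do not reflect any particular repository's schema.

```python
def check_deposit(components: list[dict]) -> list[str]:
    """Return a list of problems found in a deposit's licence metadata."""
    problems = []
    for part in components:
        licence = part.get("licence")
        if not licence:
            # Checklist step 4: every component needs an explicit licence.
            problems.append(f"{part['path']}: no licence stated")
        elif part["kind"] == "software" and licence.startswith("CC"):
            # Checklist step 3: code should not carry a CC licence.
            problems.append(f"{part['path']}: Creative Commons licence on software")
    return problems


# Illustrative deposit; licences are written as SPDX identifiers.
deposit = [
    {"path": "data/measurements.csv", "kind": "data", "licence": "CC0-1.0"},
    {"path": "src/clean.py", "kind": "software", "licence": "CC-BY-4.0"},
    {"path": "src/plot.py", "kind": "software", "licence": "MIT"},
    {"path": "data/codebook.csv", "kind": "data"},
]

for problem in check_deposit(deposit):
    print(problem)
```

    Run on the sample deposit, the check reports the CC-licensed script and the unlicensed codebook, and passes the CC0 data file and the MIT-licensed script.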

    How this connects to contribution and credit

    Licensing answers “what may be done with this output?”; it is a sibling of the question “who made it?”, which the CRediT taxonomy answers. A dataset’s intellectual work is recorded on the associated paper through roles such as Data curation and Investigation, while the licence governs downstream reuse of the artefact itself. Used together — a clear licence on the data and clear contribution roles on the people — they ensure both the dataset and its creators are properly accounted for.

    Where shared vocabulary fits

    “CC0”, “CC BY”, “public domain”, “attribution”, and “reuse” are interpreted differently across repositories and funders, which undermines the very interoperability that licensing is meant to enable. A shared, federated vocabulary that defines these terms precisely — pointing back to Creative Commons for the licences and to the FAIR principles for the reusability requirement — is what lets a licence chosen for one repository be understood correctly in another. Supplying that definitional layer is the role the CASRAI dictionary is designed to play; the relevant terms sit in the data-infrastructure domain.
