
  • Text and Data Mining Copyright Exception: The UK’s 2026 Decision

    The UK’s text and data mining copyright exception currently permits copying protected works for computational analysis only where the purpose is non-commercial research, under Section 29A of the Copyright, Designs and Patents Act 1988. In March 2026, the UK government confirmed it will not introduce a broader, opt-out-based exception for commercial AI training, leaving that narrow research carve-out unchanged while commercial text and data mining continues to require a licence.

    Text and data mining (TDM) is the automated analytical process of extracting patterns, structured data, and statistical relationships from digital text, images, and other content — typically the first stage in assembling a corpus to train a machine-learning or generative AI model.

    What is the UK’s text and data mining copyright exception?

    Section 29A of the Copyright, Designs and Patents Act 1988 allows a person with lawful access to a copyright work — for example through a subscription, a library licence, or open web access — to make a copy of that work for the purpose of computational analysis, provided the resulting copy is used solely for non-commercial research. The exception was introduced in 2014, making the UK one of the earliest jurisdictions to legislate specifically for TDM.

    Two features make Section 29A unusually workable for universities. First, it extends to individual researchers, not only to the institutions that employ them. Second, under section 29A(5), any contract term that purports to prevent or restrict this lawful copying is unenforceable — a publisher cannot use its licence terms to override the statutory right. What the exception does not do is authorise commercial use: a university spin-out or an industry-funded lab training a model for eventual commercial deployment sits outside its protection.
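
    The conditions described above can be read as a simple decision rule. The sketch below is an illustration only, not legal advice or an official test: `TdmActivity` and `within_section_29a` are hypothetical names introduced here, and real cases turn on facts that a handful of booleans cannot capture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TdmActivity:
    """Hypothetical description of one text-and-data-mining activity."""
    lawful_access: bool            # e.g. subscription, library licence, open web
    computational_analysis: bool   # the copy is made in order to analyse the work
    non_commercial_research: bool  # the sole purpose is non-commercial research
    contract_forbids_tdm: bool     # a licence term purports to restrict the copying


def within_section_29a(activity: TdmActivity) -> bool:
    """Rough reading of the Section 29A conditions as summarised in the text."""
    # A restrictive contract term is deliberately ignored here: under
    # section 29A(5) it is unenforceable, so it does not change the outcome.
    return (
        activity.lawful_access
        and activity.computational_analysis
        and activity.non_commercial_research
    )


# Illustrative cases (assumed facts, not real projects).
university_project = TdmActivity(True, True, True, contract_forbids_tdm=True)
spin_out_training = TdmActivity(True, True, False, contract_forbids_tdm=False)

print(within_section_29a(university_project))  # True
print(within_section_29a(spin_out_training))   # False
```

    The university project stays inside the exception even though its licence forbids mining, while the spin-out falls outside it purely because of its commercial purpose.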

    What broader exception did the government propose?

    The UK Intellectual Property Office first floated a much wider TDM exception in 2022, permitting data mining for any purpose, including commercial AI training, with no opt-out for rights holders. That proposal was abandoned in 2023 after sustained opposition from publishers, musicians, and other creative-industry rights holders, who argued it would let AI developers use copyrighted work without consent or payment.

    The government returned to the question in its December 2024 consultation, Copyright and Artificial Intelligence, which ran for ten weeks, from 17 December 2024 to 25 February 2025. This time the model differed: a rights-reservation (“opt-out”) mechanism paired with a broad exception covering unreserved material, alongside new transparency obligations requiring AI developers to disclose what content they use and how they acquire it. The design echoed Article 4 of the EU’s Digital Single Market (DSM) Directive, and the consultation set out three broad directions: make no legislative change, introduce the opt-out exception, or strengthen licensing.

    • Option A — status quo: retain Section 29A unchanged, relying on licensing markets to develop organically.
    • Option B — broad exception with opt-out: allow TDM for any purpose unless a rights holder actively reserves their rights.
    • Option C — licensing-led framework: mandate transparency and collective licensing infrastructure without a new statutory exception.

    What did the UK government decide in March 2026?

    The government confirmed in March 2026 that it will not proceed with a new text and data mining exception. According to The Ivors Academy, which represents songwriters and composers and was among the respondent organisations, 88% of consultation respondents called for stronger copyright protection and licensing rather than a broadened exception. Section 29A therefore remains the operative law: non-commercial research TDM is lawful without a licence, and everything outside that narrow purpose still requires rights-holder permission.

    Rather than legislating a new exception, the government is directing policy into four linked work programmes: digital replicas (deepfake-style AI recreations of a person’s voice or likeness); AI-output labelling; creator control and transparency obligations for AI developers; and support for independent creatives. Legislation on transparency, which would require AI developers to disclose training data provenance, remains under active consideration even though the exception itself has been shelved.

    How does the UK exception compare with the EU’s TDM rules?

    The UK operates a single, narrower exception than the EU, which — under the 2019 DSM Directive — runs two parallel TDM exceptions with different scopes and different rules on opting out.

    | Feature | UK: Section 29A, CDPA 1988 | EU: Article 3, DSM Directive | EU: Article 4, DSM Directive |
    |---|---|---|---|
    | Permitted purpose | Non-commercial research only | Scientific research | Any purpose, including commercial |
    | Eligible parties | Research institutions and individual researchers | Research organisations and cultural heritage institutions | Any lawful user |
    | Opt-out for rights holders | None; contract terms restricting it are unenforceable | None | Yes; rights holders may reserve rights via machine-readable means |
    | Extends to databases | No; copyright works only | Yes | Yes |
    | Commercial research covered | No | Limited, in some readings | Yes |

    The UK’s December 2024 proposal effectively asked whether the country should import something resembling Article 4 — a broad, opt-out-based exception. Following the March 2026 decision, the UK still has no equivalent, leaving a materially narrower legal basis for commercial AI training than exists across the EU.
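
    To make the Article 4 idea of a reservation "via machine-readable means" concrete, the sketch below shows how a crawler might check one such signal before mining a page. It rests on a loud assumption: that a robots.txt rule is treated as the reservation. The Directive does not name a format, the source does not say robots.txt satisfies Article 4, and `ExampleTdmBot`, `may_mine` and the URL are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# Invented robots.txt: the site reserves its content against one mining bot
# and leaves it open to everyone else.
ROBOTS_TXT = """\
User-agent: ExampleTdmBot
Disallow: /

User-agent: *
Allow: /
"""


def may_mine(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return False when robots.txt disallows this agent for this URL.

    Treating a robots.txt rule as a rights reservation is an assumption of
    this sketch, not a statement of what Article 4 requires.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network call
    return parser.can_fetch(user_agent, url)


page = "https://example.org/articles/1"
print(may_mine(ROBOTS_TXT, "ExampleTdmBot", page))  # False
print(may_mine(ROBOTS_TXT, "OtherBot", page))       # True
```

    Nothing comparable decides the question in the UK: under Section 29A the test is the purpose of the copying, not whether a rights holder has posted a reservation.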

    What does this mean for research libraries and AI labs?

    For university libraries and text-mining labs conducting genuinely non-commercial research, the practical position is unchanged: Section 29A continues to authorise corpus-building from lawfully accessed material, and publisher licence terms cannot override that right. Institutions should still document lawful access and non-commercial purpose, since the boundary — not the existence — of the exception is what gets tested.
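
    One way to keep that documentation is a small structured record per source, written when the corpus is built. The sketch below is a minimal illustration under stated assumptions: the field names, the `TdmAccessRecord` type, the `make_record` helper and every example value are hypothetical, and nothing in Section 29A prescribes this format.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class TdmAccessRecord:
    """Hypothetical audit entry for one source used in a research corpus."""
    source: str            # what was copied
    access_basis: str      # how lawful access was obtained
    research_purpose: str  # the non-commercial research aim
    funder: str            # who pays for the project
    recorded_on: str       # ISO date on which the entry was made


def make_record(source: str, access_basis: str, research_purpose: str,
                funder: str, recorded_on: date) -> TdmAccessRecord:
    """Build a record, refusing entries that leave a text field blank."""
    fields = (source, access_basis, research_purpose, funder)
    if any(not value.strip() for value in fields):
        raise ValueError("every field must be filled in")
    return TdmAccessRecord(source, access_basis, research_purpose, funder,
                           recorded_on.isoformat())


# All values below are invented placeholders.
record = make_record(
    source="Example Journal of Linguistics, 2010-2020 back file",
    access_basis="institutional subscription",
    research_purpose="PhD study of terminology drift",
    funder="university doctoral fund",
    recorded_on=date(2026, 3, 31),
)
print(json.dumps(asdict(record), indent=2))
```

    Recording the funder alongside the purpose is a deliberate choice: it is often the first place where a project's non-commercial status comes into question.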

    For AI developers, commercial spin-outs, and industry-funded research partnerships, the position stays materially harder than under a broad opt-out exception. These organisations must continue to:

    • Secure licences from publishers, image libraries, and other rights holders before using material for training corpora with any commercial application.
    • Track the government’s forthcoming transparency obligations, which may require disclosure of training-data provenance even without a new exception.
    • Watch the EU comparison closely — jurisdictions with broader TDM rights may become more attractive for model training, a concern the government itself raised in 2024.
    • Distinguish clearly between non-commercial research activity (protected) and downstream commercial application of the same corpus (not protected).

    Research administrators will need to keep licensing and TDM-exception boundaries visible in data-management and AI-use policies, particularly where university-industry collaborations blur the line Section 29A treats as decisive.

    Frequently asked questions

    What is the text and data mining exception in UK copyright law?

    The text and data mining exception is Section 29A of the Copyright, Designs and Patents Act 1988. It allows anyone with lawful access to a copyright work to copy it for computational analysis, but only for non-commercial research. It does not cover commercial AI training, and any contract term restricting this lawful use is unenforceable under section 29A(5).

    Does the UK allow AI companies to train models on copyrighted text without a licence?

    No. Outside the narrow non-commercial research exception, UK copyright law requires AI developers to obtain a licence from rights holders before using protected text, images, or data for training. The government confirmed in March 2026 it will not introduce a broader exception, so commercial text and data mining still needs permission.

    What was the UK government’s opt-out proposal for AI and copyright?

    In its December 2024 consultation, the government proposed letting rights holders reserve their rights (opt out) while introducing a wide exception permitting AI developers to mine unreserved material at scale, modelled loosely on the exception in Article 4 of the EU’s DSM Directive. This proposal was not taken forward.

    How is the UK’s TDM exception different from the EU’s?

    The UK’s single exception (Section 29A) covers only non-commercial research but extends to individual researchers, not just institutions. The EU’s DSM Directive instead runs two exceptions: Article 3 for research organisations and cultural-heritage bodies, and Article 4, a broader opt-out exception the UK ultimately declined to replicate.

    Looking ahead

    The March 2026 decision closes one chapter of the UK’s AI-copyright debate but opens another: a licensing-led, transparency-driven framework built around four active work programmes rather than a single statutory fix. For research libraries, the immediate legal position is stable. For AI labs and commercial partnerships built on UK-hosted training data, the absence of a broad exception means licensing negotiations — not statute — will continue to determine what can lawfully be mined.