Tag: coara vs dora

  • Dimensions Altmetrics, Scopus & Web of Science: A DORA-Aligned Comparison

    Dimensions altmetrics, Scopus CiteScore, and Web of Science’s Journal Impact Factor answer different questions about the same paper: how much online attention it attracted, how its journal’s four-year citation average compares, and how its journal’s two-year citation average compares within a more selective, curated index. No single number from any one database satisfies the call in the San Francisco Declaration on Research Assessment (DORA) for multi-indicator, qualitative-plus-quantitative evaluation, which is why research offices increasingly triangulate across all three.

    A citation database is a structured index of scholarly publications and their citation links, used to measure research coverage, impact, and attention across disciplines. Dimensions, Scopus, and Web of Science each build that index differently, and the differences matter directly for institutions trying to run altmetrics-aware, DORA-compliant assessment rather than single-metric ranking.

    How does coverage differ across Dimensions, Scopus and Web of Science?

    Coverage breadth is the single biggest structural difference between the three databases, and it is measurable rather than a matter of opinion. A 2021 Scientometrics study by Singh, Singh, Karmakar, Leta and Mayr found that Dimensions indexes 82.22% more journals than Web of Science and 48.17% more journals than Scopus. Dimensions widens that gap further by ingesting preprints, grants, patents, clinical trials, and policy documents alongside conventional journal articles.

    A separate large-scale comparison published in Quantitative Science Studies (Visser, van Eck and Waltman, 2021, MIT Press) benchmarked Scopus, Web of Science, Dimensions, Crossref and Microsoft Academic together and found that Dimensions and Crossref offer the broadest raw coverage, while Scopus and Web of Science retain more curated, higher-quality affiliation and subject metadata. Web of Science’s Core Collection remains the most selective of the three, with editorial evaluation criteria dating to Eugene Garfield’s Science Citation Index, first published in 1964; Scopus, launched by Elsevier in 2004, applies a comparatively more inclusive Content Selection and Advisory Board process.

    The practical implication: a citation count pulled from only one database will systematically undercount or overcount depending on discipline, document type, and region. A 2020 comparison from the German Kompetenznetzwerk Bibliometrie (Stahlschmidt and Hinze) reached the same conclusion — the three sources are not interchangeable, and cross-checking is a foundational bibliometric hygiene step, not an optional extra.

    What metrics does each database produce?

    Each platform has developed its own headline indicator, and none of the three is a like-for-like substitute for the others.

    Database | Owner | Headline metric | Citation window | Altmetrics integration
    Dimensions | Digital Science | Citation counts + linked Altmetric Attention Score | No fixed window; article-level | Native (shares parent company with Altmetric)
    Scopus | Elsevier | CiteScore; Field-Weighted Citation Impact (FWCI) via SciVal | 4-year rolling window | PlumX Metrics
    Web of Science | Clarivate | Journal Impact Factor (JCR) | 2-year window (5-year variant available) | Article-level usage counts; expanding via Research Intelligence tools

    CiteScore, introduced by Elsevier in 2016, is a journal-level ratio of citations to documents over a four-year window. Under the methodology Elsevier has used since 2020, it divides the citations received during that window by the peer-reviewed documents (articles, reviews, conference papers, book chapters, and data papers) published in the same window. It is published free of charge, a deliberate contrast with the subscription-gated Journal Impact Factor. Field-Weighted Citation Impact normalises a paper’s citations against the world average for its subject, publication year, and document type, where a score of 1.0 represents parity with the global average; this makes FWCI more field-comparable than a raw citation count. The Altmetric Attention Score, meanwhile, is not a citation metric at all: it is a weighted count of online attention (news coverage, policy documents, X/social posts, Wikipedia references, blogs) that Dimensions surfaces natively because Dimensions and Altmetric are both Digital Science products.
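    To make the two ratios concrete, here is a minimal Python sketch. It assumes the window totals and the expected-citation baseline have already been exported from the relevant database; the names `cite_score`, `fwci` and `Paper`, and all the numbers, are illustrative and do not come from Elsevier's tooling.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    citations: int             # citations the paper actually received
    expected_citations: float  # assumed baseline: world average for same field, year, document type


def cite_score(citations_in_window: int, documents_in_window: int) -> float:
    """Journal-level ratio: citations counted in the window per document in the window."""
    if documents_in_window <= 0:
        raise ValueError("documents_in_window must be positive")
    return citations_in_window / documents_in_window


def fwci(paper: Paper) -> float:
    """Article-level ratio of actual to expected citations; 1.0 means parity with the world average."""
    if paper.expected_citations <= 0:
        raise ValueError("expected_citations must be positive")
    return paper.citations / paper.expected_citations


# Illustrative numbers only, not taken from any real journal or paper.
journal_score = cite_score(citations_in_window=1200, documents_in_window=400)
paper_score = fwci(Paper(citations=18, expected_citations=12.0))
print(journal_score)  # 3.0
print(paper_score)    # 1.5
```

    The journal-level figure says nothing about the individual paper, and the paper-level figure depends entirely on the baseline chosen, which is why the two cannot stand in for each other.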

    Which database best supports DORA-compliant, multi-indicator assessment?

    DORA, drafted in 2012, published in 2013, and now signed by thousands of organisations worldwide, asks institutions to stop using journal-based metrics such as the Impact Factor as a proxy for the quality of an individual researcher’s contributions, and instead to consider the value and impact of all research outputs alongside qualitative peer judgement. The 2015 Leiden Manifesto (Hicks, Wouters, Waltman, de Rijcke and Rafols, published in Nature) added ten operating principles for responsible metrics use, including that quantitative evaluation should support, not replace, qualitative expert assessment.

    All three database vendors now publicly reference these frameworks, but their practical alignment differs. Digital Science, Dimensions’ parent company, is listed on DORA’s public signatory register, and Dimensions’ native pairing with Altmetric gives assessors an attention-based indicator alongside citations without needing a separate subscription. Elsevier has endorsed the Leiden Manifesto and built CiteScore’s open methodology partly in response to its principles. Clarivate likewise cites the Leiden Manifesto in its own responsible-metrics guidance and has begun layering a “Societal Impact Framework” onto Web of Science Research Intelligence to capture impact beyond citation counts.

    None of the three databases is independently DORA-compliant by design — compliance is a property of how an institution uses the data, not of the database itself. A single Impact Factor, CiteScore, or Altmetric Attention Score used alone to rank individuals contradicts DORA regardless of source. Multi-indicator assessment requires combining citation-based indicators from at least one curated database with attention-based indicators and qualitative peer review — which is precisely why UK funders and the Research Excellence Framework have explicitly excluded journal impact factors from submission guidance since 2014, requiring panel-level qualitative judgement instead.

    Where does OpenAlex fit as an open alternative?

    OpenAlex, launched in 2022 by the non-profit OurResearch as a fully open successor to the discontinued Microsoft Academic Graph, has emerged as the fourth reference point in this comparison. Unlike Dimensions, Scopus, and Web of Science, OpenAlex publishes its entire dataset and API without subscription cost, drawing on Crossref, ORCID, and ROR identifiers for disambiguation rather than proprietary matching.

    OpenAlex does not yet match the curated metadata quality or the established institutional trust of Scopus or Web of Science, and it carries no equivalent to the Altmetric Attention Score. But for institutions constrained by licensing budgets, or for bibliometrics tools built on reproducible, auditable pipelines, OpenAlex is increasingly used as a free cross-check against the commercial databases rather than a replacement for them.
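    A cross-check of this kind can be sketched in a few lines of Python. The sketch assumes the public OpenAlex works endpoint, which accepts a DOI URL as the work identifier, and the `cited_by_count` field on the returned work record. The DOI, the helper names and the `relative_gap` measure are hypothetical illustrations, and the network call is defined but not executed here.

```python
import json
import urllib.parse
import urllib.request

OPENALEX_WORKS = "https://api.openalex.org/works/"


def openalex_work_url(doi: str) -> str:
    """Build the lookup URL for a single work, identified by its DOI."""
    return OPENALEX_WORKS + "https://doi.org/" + urllib.parse.quote(doi, safe="/")


def extract_cited_by_count(work: dict) -> int:
    """Read the citation count from a work record, defaulting to 0 if the field is absent."""
    return int(work.get("cited_by_count", 0))


def fetch_cited_by_count(doi: str, timeout: float = 10.0) -> int:
    """Fetch a work record over HTTP and return its citation count (requires network access)."""
    with urllib.request.urlopen(openalex_work_url(doi), timeout=timeout) as response:
        return extract_cited_by_count(json.load(response))


def relative_gap(licensed_count: int, open_count: int) -> float:
    """Share by which the two sources disagree, relative to the larger count."""
    larger = max(licensed_count, open_count)
    return abs(licensed_count - open_count) / larger if larger else 0.0


# Offline illustration with a hypothetical DOI and hypothetical counts.
url = openalex_work_url("10.1234/example")
sample_record = {"cited_by_count": 50}
gap = relative_gap(licensed_count=40, open_count=extract_cited_by_count(sample_record))
print(url)
print(gap)  # 0.2
```

    A large gap is a prompt to check coverage and document type in both sources, not evidence that either count is wrong.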

    Answer-first questions

    What is Altmetric a measure of?

    Altmetric measures online attention, not citation impact. It tracks mentions of a research output across news media, policy documents, social platforms, blogs, and Wikipedia, then produces a weighted Attention Score. Because it captures engagement that predates or bypasses formal citation, it is treated as complementary to citation-based indicators, not a replacement for them.

    What counts as a good Altmetric score?

    There is no universal threshold, because Attention Scores vary enormously by field, output type, and publication date. As a rough benchmark, Altmetric itself notes that a score above roughly 20 typically outperforms most tracked outputs, but comparisons are only meaningful against similar papers in the same journal and timeframe, never as an absolute cutoff.
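    The point about context can be illustrated with a short Python sketch that ranks one score against two different comparison cohorts. The function name and both cohorts are invented for illustration; Altmetric's own contextual percentiles may be calculated differently.

```python
def percentile_in_cohort(score: float, cohort_scores: list[float]) -> float:
    """Percentage of cohort outputs whose score is at or below the given score."""
    if not cohort_scores:
        raise ValueError("cohort_scores must not be empty")
    at_or_below = sum(1 for other in cohort_scores if other <= score)
    return 100 * at_or_below / len(cohort_scores)


# Hypothetical cohorts: outputs from the same journal and timeframe as the paper being assessed.
low_attention_cohort = [0, 1, 2, 2, 3, 5, 8, 13, 21, 40]
high_attention_cohort = [21, 30, 55, 80, 120]

print(percentile_in_cohort(21, low_attention_cohort))   # 90.0
print(percentile_in_cohort(21, high_attention_cohort))  # 20.0
```

    The same score of 21 sits near the top of one cohort and near the bottom of the other, which is why a fixed cutoff is uninformative.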

    Is Scopus or Web of Science better for research assessment?

    Neither is unconditionally “better” — Scopus offers broader, more geographically diverse journal coverage with a transparent four-year CiteScore, while Web of Science offers deeper historical coverage back to 1900 and the still-widely-recognised Impact Factor. DORA-aligned assessment favours using both alongside non-citation indicators rather than choosing one as authoritative.

    Implications for research offices

    Research administrators selecting or combining these tools should treat the choice as an assessment-design decision, not a procurement afterthought. Three practical consequences follow directly from the coverage and metric differences above:

    • A researcher’s citation count and h-index will differ meaningfully between Dimensions, Scopus and Web of Science — institutions must specify and disclose which source underlies any reported figure.
    • Attention-based data (Altmetric, PlumX) captures policy and public engagement that citation-only databases miss entirely, which matters for funders assessing societal impact pathways.
    • Free, open sources such as OpenAlex are viable supplementary cross-checks, particularly where licensing cost restricts access to all three commercial platforms.
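    The first consequence is easy to demonstrate. The Python sketch below computes total citations and an h-index for one researcher from three per-source citation lists. The counts are hypothetical and stand in for exports from each platform.

```python
def h_index(citation_counts: list[int]) -> int:
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical per-paper citation counts for the same six papers in each source.
counts_by_source = {
    "Dimensions": [25, 14, 9, 7, 5, 3],
    "Scopus": [21, 12, 8, 4, 4, 2],
    "Web of Science": [18, 10, 6, 3, 2, 1],
}

report = {
    source: {"citations": sum(counts), "h_index": h_index(counts)}
    for source, counts in counts_by_source.items()
}

for source, figures in report.items():
    print(source, figures)
```

    Storing the source name alongside every figure, as the `report` dictionary does, is one simple way to make the disclosure requirement hard to skip.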

    Conclusion

    The three databases are converging on responsible-metrics language while remaining structurally distinct in coverage, indicator design, and cost. Institutions that want genuinely DORA-compliant, multi-indicator assessment should treat Dimensions, Scopus and Web of Science as complementary evidence sources — pairing at least one citation database with an attention-based indicator and qualitative peer review — rather than defaulting to whichever single number is easiest to pull from a subscription dashboard.

  • Leiden Manifesto Checklist for Research Offices

    The Leiden Manifesto for Research Metrics sets out ten principles, published as a comment in Nature in 2015, for the responsible use of quantitative indicators in research evaluation. Research offices can convert each principle into a direct audit question, testing whether KPI dashboards, promotion criteria and grant-review rubrics rely on a single metric, ignore field norms, or substitute for qualitative judgement.

    The Leiden Manifesto for Research Metrics is a ten-principle framework for the responsible use of bibliometric and other quantitative indicators in evaluating research, published by Diana Hicks, Paul Wouters, Ludo Waltman, Sarah de Rijcke and Ismael Rafols in Nature on 22 April 2015. It was formulated at the 19th International Conference on Science and Technology Indicators, held in Leiden, the Netherlands, in September 2014, and has since been cited more than 4,000 times, according to Google Scholar’s tracking of the original paper.

    What is the Leiden Manifesto for Research Metrics?

    The Leiden Manifesto is a response to what its authors called “impact-factor obsession” — the tendency of universities, funders and promotion committees to substitute a single number for expert judgement. It does not ban metrics. It requires that quantitative indicators support, rather than replace, informed peer assessment of research quality.

    The manifesto’s home institution is the Centre for Science and Technology Studies (CWTS) at Leiden University, where co-author Paul Wouters served as director. CWTS also produces the CWTS Leiden Ranking, a separate bibliometrics-based university ranking — a distinction research offices should not conflate when citing the source.

    What are the ten principles of the Leiden Manifesto?

    Each principle addresses a specific failure mode observed in metric-driven research assessment. The table below states each principle exactly as published, alongside the practical audit question a research office should ask of its own KPI or promotion framework.

    # | Principle (Hicks et al., 2015) | Audit question for your office
    1 | Quantitative evaluation should support qualitative, expert assessment | Does any committee decision rest on a metric alone, with no narrative peer input?
    2 | Measure performance against the research missions of the institution, group or researcher | Are KPIs generic, or tailored to the unit’s stated mission (teaching-intensive, applied, translational)?
    3 | Protect excellence in locally relevant research | Does the framework penalise work published in non-English or regionally focused outlets?
    4 | Keep data collection and analytical processes open, transparent and simple | Can an academic reproduce their own score from publicly documented methodology?
    5 | Allow those evaluated to verify data and analysis | Is there a formal, timely route to challenge or correct metric data before a decision is made?
    6 | Account for variation by field in publication and citation practices | Are raw citation counts compared across disciplines without field normalisation?
    7 | Base assessment of individual researchers on a qualitative judgement of their portfolio | Do the promotion criteria require a portfolio narrative, or just an h-index threshold?
    8 | Avoid misplaced concreteness and false precision | Are decimal-point differences in impact factor or citation rate treated as meaningful?
    9 | Recognise the systemic effects of assessment and indicators | Has the office assessed whether its KPIs create incentives to game submission counts or venues?
    10 | Scrutinise indicators regularly and update them | Is there a scheduled review cycle for the KPI framework itself, not just for scores against it?

    How can a research office audit its KPI and promotion framework against it?

    Running the manifesto as a live audit tool means working through each principle against real artefacts: the appraisal form, the promotion rubric, and the departmental dashboard.

    1. Mark every clause in the promotion/tenure criteria naming a specific metric (impact factor, h-index, citation count).
    2. Check each marked clause has a qualitative narrative requirement alongside it (Principles 1 and 7).
    3. Confirm KPI targets are set per unit mission, not copied institution-wide (Principle 2).
    4. Check non-English-language or applied outputs score on the same scale as high-impact-journal outputs (Principle 3).
    5. Verify each dashboard metric’s data source and calculation method is documented and accessible (Principles 4 and 5).
    6. Confirm citation indicators are field-normalised, not raw counts compared across disciplines (Principle 6).
    7. Look for false precision — ranking staff by two-decimal citation averages (Principle 8).
    8. Ask whether the KPI framework has driven any unintended behaviour, such as salami-slicing publications or discouraging risky research (Principle 9).
    9. Set a fixed review date for the framework itself, independent of individual appraisal cycles (Principle 10).
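    Steps 1 and 2 of the checklist lend themselves to a first-pass automated scan before a human reads the document. The Python sketch below is one hypothetical way to do that: the keyword patterns, the function name and the sample clauses are all assumptions, and a keyword match is only a prompt for manual review, not a finding.

```python
import re

# Assumed keyword lists; a real office would tune these to its own documents.
METRIC_PATTERN = re.compile(r"impact factor|h-index|citation count", re.IGNORECASE)
NARRATIVE_PATTERN = re.compile(r"narrative|peer review|portfolio|expert", re.IGNORECASE)


def audit_clauses(clauses: list[str]) -> list[dict]:
    """Return one finding per clause that names a metric, noting whether it also asks for qualitative input."""
    findings = []
    for number, clause in enumerate(clauses, start=1):
        metrics = sorted({match.lower() for match in METRIC_PATTERN.findall(clause)})
        if not metrics:
            continue  # step 1: only clauses naming a specific metric are marked
        findings.append({
            "clause": number,
            "metrics": metrics,
            "has_narrative": bool(NARRATIVE_PATTERN.search(clause)),  # step 2
        })
    return findings


# Hypothetical promotion criteria.
criteria = [
    "Candidates must hold an h-index of at least 20.",
    "Citation count is considered alongside a portfolio narrative reviewed by the panel.",
    "Candidates should demonstrate a sustained teaching contribution.",
]

findings = audit_clauses(criteria)
flagged = [finding["clause"] for finding in findings if not finding["has_narrative"]]
print(flagged)  # [1]
```

    Here only the first clause is flagged, because it sets a metric threshold with no accompanying qualitative requirement.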

    A framework that fails more than two or three of these checks is not aligned with the manifesto, regardless of how sophisticated its dashboard software looks. The most common failure in practice is Principle 6: comparing raw citation counts across a mathematics department and a cell biology department, where top-ranked mathematics journals carry impact factors around 3 while top-ranked cell biology journals carry impact factors around 30 — a field-scale gap the manifesto’s authors cite directly as evidence that uncorrected cross-field comparison is meaningless.

    How does the Leiden Manifesto compare with DORA and CoARA?

    The Leiden Manifesto did not appear in isolation. The 2013 San Francisco Declaration on Research Assessment (DORA) preceded it, while the Coalition for Advancing Research Assessment (CoARA) has since built a sector-wide agreement on reforming assessment practice. Research offices are frequently asked which one to adopt.

    Framework | Published | Format | Primary focus
    Leiden Manifesto | 22 April 2015 (Nature comment) | 10 principles | Correct use of quantitative indicators across disciplines and settings
    DORA | 2013 (San Francisco Declaration) | General recommendations + signatory pledge | Eliminating journal impact factor as a proxy for article or researcher quality
    CoARA | 2022 (Agreement on Reforming Research Assessment) | Institutional commitment agreement | Sector-wide reform of hiring, promotion and funding assessment criteria

    DORA has been signed by more than 27,000 individuals and organisations, according to DORA’s own published tally as of March 2026, making it the higher-profile pledge. But when Loughborough University’s LIS-Bibliometrics committee chose a framework for its own policy in 2018, policy manager Elizabeth Gadd selected the Leiden Manifesto because it takes a “broader approach to the responsible use of all bibliometrics across a range of disciplines and settings” — not only journal-level metrics. Elsevier separately announced on 14 July 2020 that it would use the manifesto’s principles to guide its CiteScore methodology.

    In the UK, the independently commissioned Metric Tide review (2015), led by James Wilsdon for the then Higher Education Funding Council for England, reached compatible conclusions and recommended that metrics support, not replace, peer review within the research administration processes underpinning the Research Excellence Framework. A research office building a REF-adjacent KPI policy should treat the Metric Tide and the Leiden Manifesto as aligned, not competing, references.

    Common questions and what comes next for research offices

    Who wrote the Leiden Manifesto for Research Metrics?

    The manifesto was written by Diana Hicks, professor of public policy at Georgia Institute of Technology, and Paul Wouters, then director of CWTS at Leiden University, together with co-authors Ludo Waltman, Sarah de Rijcke and Ismael Rafols. It was published as a comment in Nature, volume 520, on 22 April 2015.

    Does the Leiden Manifesto ban the use of bibliometrics tools?

    No. The manifesto does not prohibit bibliometrics tools such as Web of Science, Scopus or Dimensions. It requires that any output from these tools — citation counts, h-indices, journal metrics — be interpreted alongside qualitative expert review and adjusted for field-specific citation norms before it informs a decision.

    Why does the importance of bibliometrics remain contested?

    Bibliometrics matter because they scale evaluation across thousands of researchers where individual peer review is impractical. The contested part is misuse: treating a single indicator as an objective proxy for quality, rather than one input alongside portfolio review, mission fit and field context, as the manifesto’s ten principles specify.

    How often should a research office review its KPI framework under the manifesto?

    Principle 10 requires that indicators be scrutinised regularly and updated, but sets no fixed interval. Good institutional practice, reflected in library and research-office guidance built on the manifesto, is an annual technical review of data sources plus a full policy review on the same three-to-five-year cycle as promotion-criteria revisions.

    The Leiden Manifesto’s ten principles were written as durable evaluation ethics, not a one-time compliance exercise. As institutions layer AI-assisted analytics, altmetrics and funder-mandated open-data reporting onto existing KPI frameworks, the manifesto’s core requirement — that quantitative evaluation support, not replace, expert judgement — becomes harder to satisfy by default and more important to audit deliberately. Research offices that build the checklist above into their annual promotion-criteria review cycle, rather than treating the manifesto as background reading, are the ones actually applying it.

  • Is Self-Citation Ethical in Responsible Metrics?

    Is self-citation ethical? Self-citation is ethical when an author cites their own prior work because it is genuinely relevant to a new argument, method, or dataset; it becomes unethical only when the primary motive shifts to inflating citation counts, h-index, or a journal’s impact factor. Neither DORA nor CoARA — the two dominant responsible-metrics frameworks — sets a self-citation rule, leaving this judgement almost entirely to editors, reviewers, and individual conscience.

    Self-citation is the practice of an author referencing their own previously published work within a new publication, most commonly to establish methodological continuity, avoid self-plagiarism, or trace the development of a research programme over time.

    What counts as self-citation, and why do researchers do it?

    Self-citation occurs whenever an author lists their own prior publication in a new paper’s reference list. It is neither rare nor inherently suspect: most research is cumulative, and a study that builds on a researcher’s earlier method, dataset, or theoretical framework has good reason to cite that earlier work directly.

    • Establishing methodological continuity with a previously validated technique or instrument
    • Avoiding self-plagiarism by properly attributing earlier text, data, or ideas
    • Tracing the trajectory of a multi-paper research programme for the reader
    • Providing background the author is best placed to cite because they generated the original finding

    The Committee on Publication Ethics (COPE) has noted that failing to cite one’s own directly relevant prior work can itself mislead readers into thinking a study is more novel than it is — so the ethical failure mode runs in both directions, not only toward over-citation.

    How much self-citation is considered excessive?

    There is no single, universally agreed self-citation rate ceiling. A 2023 analysis available in PubMed Central (PMC) concluded that a self-citation rate around 20 percent is conservatively tolerable for individual researchers, with rates substantially above that treated as inappropriate. The same paper stresses, however, that discipline size and publication norms shift what counts as normal.

    COPE’s own November 2017 forum discussion, “Self-Citation: Where’s the Line?”, found no consensus figure among editors. Some journals cap the absolute number of self-citations (for example, no more than five), others use a percentage-of-total-references ceiling, and many rely on case-by-case editorial judgement rather than a fixed rule. COPE’s broader position on handling citation manipulation asks journals to set their own thresholds and educate authors, rather than prescribing one number for the whole of scholarly publishing.

    A 2025 analysis in the Journal of Academic Ethics (Springer) reinforces the intent-based test over a rate-based one, concluding that “ethical reviewers should avoid unnecessary self-citation” while allowing that citing one’s own work is acceptable “if directly relevant” — the same relevance-over-frequency logic COPE applies.

    Why don’t DORA and CoARA address self-citation directly?

    The San Francisco Declaration on Research Assessment (DORA, drafted in 2012 and published in 2013) is aimed squarely at eliminating the use of the journal impact factor as a proxy for individual researcher quality in hiring, funding, and promotion decisions. It says nothing about how many times an author may cite themselves within a paper’s reference list: that is a citation-practice question, not a journal-metric question, and sits outside DORA’s original scope.

    The Coalition for Advancing Research Assessment (CoARA), formed in 2022, commits signatory institutions to move away from inappropriate use of quantitative indicators and toward qualitative, narrative-based evaluation. This is the closest thing academia has to a responsible-metrics consensus position, yet CoARA’s Agreement likewise does not name self-citation as a distinct risk category — it addresses metric misuse at the institutional and assessment level, not individual reference-list behaviour.

    The result is a genuine governance gap. Self-citation sits between two policy domains — publication ethics (COPE’s territory) and research assessment reform (DORA and CoARA’s territory) — without either treating it as a first-class concern. Editors are left applying inconsistent journal-level rules, while institutional assessment reformers focus almost entirely on how metrics are used rather than on what feeds into them.

    Disclosure norms vs blanket caps: the better governance model

    A blanket percentage cap on self-citation is easy to state but poorly matched to how research actually varies. Small or emerging subfields with few active authors, first-in-series methodology papers, and long-running research programmes will all show naturally higher self-citation rates than a large, well-established field — penalising a rate rather than the intent behind it risks punishing legitimate continuity while doing little to stop a determined metric-gamer, who can simply keep self-citations just under whatever line is drawn.

    A more workable precedent already exists in bibliometrics. The standardized citation-metrics database maintained by Ioannidis, Boyack, and Baas — used to identify the world’s most-cited scientists across disciplines — reports each author’s composite citation score both with and without self-citations included, alongside their raw self-citation percentage. It does not impose a cutoff; it makes the number visible and lets the reader judge. That is a disclosure model, not a cap.

    Framework | Year | Position on self-citation | Governance model
    COPE | 2017/ongoing | Case-by-case editorial judgement; no fixed universal threshold | Journal-level policy, editorial discretion
    DORA | 2012 | Not addressed; targets impact-factor misuse in assessment | Institutional assessment reform
    CoARA | 2022 | Not addressed; targets inappropriate metric use generally | Institutional assessment reform
    Ioannidis/Boyack/Baas database | 2019, updated annually | Reports self-citation rate transparently alongside adjusted score | Disclosure, no cap
    Individual journal caps | Varies | Fixed number or percentage limit on self-citations | Blunt rule, inconsistently applied

    Applying that same logic to individual authors and grant applicants is straightforward: require a disclosed self-citation rate alongside any citation-based metric submitted for hiring, promotion, or funding decisions, rather than an arbitrary cap that cannot distinguish a legitimate methods lineage from deliberate metric inflation.
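    A disclosure line of this kind is simple to compute. The Python sketch below assumes a narrow definition, counting a citation as a self-citation when the assessed author appears among the citing paper’s authors; published databases may use broader definitions that also count co-authors. The function name, field names and sample data are hypothetical.

```python
def self_citation_disclosure(author: str, citing_author_lists: list[set[str]]) -> dict:
    """Summarise citations to an author's work with and without self-citations.

    Each entry in citing_author_lists is the author set of one citing paper.
    """
    total = len(citing_author_lists)
    self_citations = sum(1 for authors in citing_author_lists if author in authors)
    rate = 100 * self_citations / total if total else 0.0
    return {
        "total_citations": total,
        "self_citations": self_citations,
        "citations_excluding_self": total - self_citations,
        "self_citation_rate_pct": round(rate, 1),
    }


# Hypothetical data: eight citing papers, two of them co-written by the assessed author.
citing_papers = [
    {"A. Researcher", "B. Colleague"},
    {"C. Other"},
    {"D. Other", "E. Other"},
    {"A. Researcher"},
    {"F. Other"},
    {"G. Other"},
    {"H. Other", "B. Colleague"},
    {"I. Other"},
]

disclosure = self_citation_disclosure("A. Researcher", citing_papers)
print(disclosure)
```

    The output reports both the raw and the adjusted count without applying any threshold, which leaves the judgement about intent with the reader.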

    Answer-first Q&A on self-citation ethics

    Is self-citation unethical?

    Self-citation is not inherently unethical. It becomes ethically problematic only when it is used to inflate citation metrics rather than to serve genuine scholarly continuity — what COPE treats as a form of citation manipulation. Relevance to the argument, not frequency, is the ethical test that matters.

    Is it okay to cite yourself in a research paper?

    Yes. Citing your own prior work is standard practice when it establishes methodological continuity, avoids self-plagiarism, or shows how a study builds on earlier findings. Problems arise only when self-citations serve no argumentative purpose beyond raising an author’s h-index or a journal’s impact factor.

    Is self-citation illegal?

    No. Self-citation is a matter of publication ethics, not law. Excessive or irrelevant self-citation can breach a journal’s editorial policy or COPE’s citation-manipulation guidance, potentially triggering a correction or editorial inquiry, but it carries no legal liability in any jurisdiction.

    Implications for journals, funders, and institutions

    Journals can adopt the disclosure model directly: require authors to report a manuscript’s self-citation percentage at submission, alongside a one-line rationale where the rate is unusually high, rather than enforcing an arbitrary cap during peer review.

    CoARA signatories reforming promotion and funding criteria are well placed to extend their existing move toward narrative CVs by asking applicants to disclose self-citation-adjusted metrics alongside any citation count submitted for assessment — consistent with CoARA’s broader commitment to context over raw indicators.

    DORA signatories evaluating individual researchers already commit to judging research on its own merits rather than by journal-level proxies; adding a self-citation disclosure line to that practice would close a gap the original 2012 declaration was never designed to cover.

    Conclusion: toward transparent, not punitive, norms

    Self-citation is not a solved problem in responsible metrics guidance — it is an unaddressed one. DORA targets journal-level metric misuse; CoARA targets institutional assessment culture; COPE offers editorial case law without a universal rule. None of the three treats individual self-citation disclosure as a named requirement.

    The fix does not need a new blanket percentage cap, which would misfire across disciplines of different sizes and publication norms. It needs a disclosure norm: report the self-citation rate, report the rationale where it is high, and let editors, funders, and hiring committees judge intent with that information in hand — the same logic that already underpins the field’s most credible standardized citation databases.

  • SciVal Bibliometrics vs the Leiden Ranking: Benchmarking Under DORA

    SciVal is Elsevier’s Scopus-based platform for benchmarking research output; the CWTS Leiden Ranking is Leiden University’s field-normalised ranking that deliberately avoids one composite score. Institutions increasingly run both together, but DORA warns that any league-table framing can reduce research quality to a single misleading number.

    SciVal bibliometrics refers to the citation and output metrics — including Field-Weighted Citation Impact (FWCI) — that Elsevier’s SciVal platform generates from Scopus data to support institutional research evaluation. Research offices now routinely pair this proprietary layer with the CWTS Leiden Ranking’s open, transparent indicators, creating a benchmarking workflow that sits in direct tension with the San Francisco Declaration on Research Assessment (DORA).

    What is SciVal and what does it measure?

    SciVal is Elsevier’s research-analytics platform, built on Scopus abstract-and-citation data, that lets subscriber institutions benchmark output, impact, and collaboration against named peer groups. It does not produce publicly indexed rankings; access is by institutional subscription, and outputs are configured per user for internal decision-making rather than public comparison.

    Core SciVal modules include:

    • Overview — publication and citation summaries for an entity over time
    • Benchmarking — side-by-side comparison against selected competitor or aspirational institutions
    • Collaboration — network maps of co-authorship at institutional, national, and international level
    • Trends — topic-level growth signals used for strategic investment decisions

    Its signature indicator is Field-Weighted Citation Impact (FWCI), the ratio of citations a set of publications actually received to the citations expected for publications of the same type, year, and subject field. A FWCI of 1.0 represents the world average for that field; values above 1.0 indicate above-average citation impact.

    How does the CWTS Leiden Ranking differ from SciVal?

    The CWTS Leiden Ranking, produced annually since 2007 by the Centre for Science and Technology Studies at Leiden University, is a free, publicly available ranking that explicitly refuses to combine indicators into one overall score. Instead it publishes separate, field-normalised tables — including MNCS (mean normalised citation score) and PP(top 10%), the proportion of an institution’s output among the world’s most-cited 10% of papers in its field.
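    The two indicators can be sketched in Python as follows. This is a simplified illustration that assumes each paper already carries a field-expected citation value and a field top-10% citation threshold; the class and function names and the numbers are hypothetical, and the published ranking's handling of fractional counting and threshold ties is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class RankedPaper:
    citations: int
    field_expected: float         # assumed: average citations for the same field and year
    field_top10_threshold: float  # assumed: citations needed to be in the field's top 10%


def mncs(papers: list[RankedPaper]) -> float:
    """Mean of each paper's citations divided by its field-expected citations."""
    return sum(p.citations / p.field_expected for p in papers) / len(papers)


def pp_top10(papers: list[RankedPaper]) -> float:
    """Proportion of papers at or above their field's top-10% citation threshold."""
    return sum(1 for p in papers if p.citations >= p.field_top10_threshold) / len(papers)


# Hypothetical institution with four papers across two fields.
papers = [
    RankedPaper(citations=12, field_expected=6.0, field_top10_threshold=20),
    RankedPaper(citations=3, field_expected=6.0, field_top10_threshold=20),
    RankedPaper(citations=45, field_expected=15.0, field_top10_threshold=40),
    RankedPaper(citations=5, field_expected=10.0, field_top10_threshold=30),
]

print(mncs(papers))      # 1.5
print(pp_top10(papers))  # 0.25
```

    The two figures describe different things: the mean is pulled up by one highly cited paper, while the top-10% share shows that only one paper in four reached that tier. Keeping them separate, rather than blending them into one score, preserves that distinction.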

    Where SciVal is a private diagnostic tool tuned to whatever comparator group an institution chooses, the Leiden Ranking is a public, methodologically documented instrument built for cross-institutional transparency. The distinction matters for governance: SciVal data informs internal strategy conversations, while Leiden Ranking data is citable externally by journalists, funders, and prospective students.

    Dimension | SciVal | CWTS Leiden Ranking
    Underlying data source | Scopus | Web of Science (Classic edition) or OpenAlex (Open Edition)
    Access model | Institutional subscription | Free and publicly browsable
    Composite score | Configurable dashboards, no single mandated score | Explicitly none; indicators kept separate by design
    Level of analysis | Author, department, institution, custom groups | Institution-level only
    Signature indicator | Field-Weighted Citation Impact (FWCI) | MNCS and PP(top 10%)
    Governing body | Elsevier (commercial) | CWTS, Leiden University (academic)

    Why does DORA caution against benchmarking with league tables?

    DORA, the San Francisco Declaration on Research Assessment published in 2012, calls on institutions to stop using journal- and rank-based proxies as substitutes for assessing the actual content of research. Its core recommendation is definitive: evaluators must not treat a journal impact factor, or by extension a university’s league-table position, as a surrogate measure of the quality of an individual researcher’s contribution.

    The UK’s Research Excellence Framework reinforces the same principle domestically: REF guidance instructs assessment panels not to rely on journal impact factors or bibliometric rankings when judging individual outputs. By DORA’s reasoning, a single Leiden Ranking position or SciVal FWCI score compresses genuinely multidimensional research performance into one figure that is easy to misuse in hiring, promotion, and funding decisions.

    How are research offices combining SciVal and Leiden in practice?

    A DORA-conscious workflow uses SciVal for granular internal diagnostics and the Leiden Ranking for transparent, external context, never letting either stand alone as a judgement on individual quality. In practice this is a two-stage process with one standing constraint, rather than a single dashboard export.

    1. Research offices first use SciVal to identify departmental strengths, emerging topics, and collaboration gaps against a self-selected comparator set.
    2. They then check institutional standing against the Leiden Ranking’s published, field-normalised indicators to see how that internal picture holds up against an independently governed, public dataset.
    3. Neither output is applied directly to an individual researcher’s promotion or tenure case, consistent with DORA’s requirement that assessment be based on the substance of the work.

    This “basket of metrics” approach — pairing a proprietary analytics tool with an open, non-composite ranking — is increasingly the model that DORA-signatory universities describe in their own research-assessment policies.

    What does the OpenAlex-based Leiden Ranking Open Edition change?

    Since 2024, CWTS has published a Leiden Ranking Open Edition built entirely on OpenAlex data, run alongside the long-standing Web of Science-based Classic edition. OpenAlex, launched by OurResearch in 2022 as a free successor to the discontinued Microsoft Academic Graph, indexes a broader and more open set of scholarly outputs than either Scopus or Web of Science.

    Because the Open Edition and Classic edition draw on different underlying databases, the same institution can show a materially different position depending on which edition is consulted — a fact rarely mentioned in library guidance on SciVal or Leiden alone. This is itself a practical argument for DORA’s caution: even among ostensibly objective, field-normalised rankings, the choice of data source alone can shift an institution’s apparent standing, before any interpretive judgement is applied.

    Common questions about SciVal bibliometrics

    Is SciVal the same as Scopus?

    No. Scopus is Elsevier’s underlying abstract-and-citation database; SciVal is a separate analytics layer built on top of Scopus data. Scopus supplies the raw publication and citation records, while SciVal turns them into benchmarking dashboards, Field-Weighted Citation Impact scores, collaboration maps, and trend reports for institutions and funders.

    What is SciVal used for?

    Research offices use SciVal to benchmark departments against named peers, track Field-Weighted Citation Impact and output trends, identify emerging research strengths, map collaboration networks, and build evidence for grant applications — functions distinct from external, public rankings such as the Leiden Ranking.

    What are the limitations of SciVal?

    SciVal’s field-normalisation depends on how Scopus classifies each publication’s subject field, which can misclassify interdisciplinary work. Coverage is limited to Scopus-indexed output, under-representing books and some social-science and humanities journals. That gap is one reason to heed DORA’s warning against treating any single metric as definitive.

    What metrics does SciVal provide?

    Core SciVal indicators include Scholarly Output, Citation Count, Field-Weighted Citation Impact (world average equals 1.0), Outputs in Top Citation Percentiles, and Collaboration metrics. These are distinct from the Leiden Ranking’s MNCS and PP(top 10%) indicators, which are used for external, field-normalised comparison.

    What this means for research administrators

    For research administration teams, the practical guidance is to treat SciVal and the Leiden Ranking as complementary diagnostic inputs, not verdicts. Any institutional report that cites either should disclose the comparator group, data source (Scopus, Web of Science, or OpenAlex), and the field-normalisation method applied, so that governance committees can judge the figures in context rather than as a rank alone.
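    One way to make that disclosure routine is to attach a small provenance record to every reported figure. The sketch below is a hypothetical structure, not a SciVal or CWTS export format: the class name, field names and the list of recognised sources are assumptions chosen to mirror the three items named above.

```python
from dataclasses import dataclass

# Data sources named in this article; extend as needed.
KNOWN_SOURCES = {"Scopus", "Web of Science", "OpenAlex"}


@dataclass(frozen=True)
class BenchmarkDisclosure:
    indicator: str
    comparator_group: str
    data_source: str
    normalisation_method: str

    def problems(self) -> list[str]:
        """Return human-readable gaps in the disclosure (empty if complete)."""
        issues = []
        for name in ("indicator", "comparator_group", "normalisation_method"):
            if not getattr(self, name).strip():
                issues.append(f"{name} is not disclosed")
        if self.data_source not in KNOWN_SOURCES:
            issues.append(f"data_source '{self.data_source}' is not a recognised source")
        return issues


# Hypothetical report entry that omits its comparator group.
entry = BenchmarkDisclosure("FWCI", "", "Scopus", "field, year and document type")
print(entry.problems())  # ['comparator_group is not disclosed']
```

    A governance committee could then decline to table any figure whose `problems()` list is not empty.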

    Where SciVal or Leiden data feeds into funding, hiring, or strategic planning, DORA-aligned institutions pair the quantitative output with qualitative peer assessment — a practice increasingly documented in the research-assessment policies of DORA-signatory universities.

    Where institutional benchmarking is heading

    As open bibliographic sources such as OpenAlex mature alongside proprietary platforms, expect research offices to triangulate across multiple data sources rather than anchor decisions to one dashboard or one ranking position. The direction of travel — visible in the Leiden Ranking’s own move to publish a parallel OpenAlex edition — is toward more transparent, multi-source benchmarking, precisely the “basket of metrics” model DORA has argued for since 2012.

    Research offices that document their methodology and keep SciVal, Leiden, and open datasets in dialogue with each other will be better placed to withstand scrutiny than those relying on any single proprietary score.

  • OpenAlex: The Case for Open Research Metrics

    OpenAlex is a free, CC0-licensed index of more than 319 million scholarly works, together with their authors and institutions, built by the non-profit OurResearch to replace the discontinued Microsoft Academic Graph. For institutions weighing research-metrics platforms, its open data answers a question closed commercial indices cannot: who can audit the numbers behind an assessment decision.

    OpenAlex, named after the Library of Alexandria, is an openly accessible bibliographic catalogue of scholarly papers, authors and institutions. One design choice, publishing the full dataset under a public-domain licence rather than behind a subscription wall, separates it structurally from Elsevier’s Scopus and Clarivate’s Web of Science, and explains why it has become a reference point in debates about research-assessment transparency.

    What Is OpenAlex?

    OpenAlex launched in January 2022, built by OurResearch (a US non-profit operating as Impactstory, Inc.) as a successor to the Microsoft Academic Graph, which Microsoft stopped updating on 31 December 2021. The project inherited MAG’s dataset and rebuilt it as an open, queryable graph of works, authors, institutions, funders, and topics.

    Two design decisions define the platform. First, the entire dataset is released under a Creative Commons Zero (CC0) licence, meaning any institution, developer, or researcher can download, redistribute, and build on it without permission or cost. Second, OpenAlex has formally adopted the Principles of Open Scholarly Infrastructure (POSI), a governance commitment covering sustainability, community control, and data portability.

    The scale is now substantial. OpenAlex’s own catalogue reports more than 319 million scholarly works, and its API handled roughly 115 million queries a month in 2024, according to figures cited in the platform’s Wikipedia entry. It draws source data from Crossref, ORCID, DOAJ, and Unpaywall rather than from a closed editorial pipeline.

    How Does OpenAlex Compare with Scopus and Web of Science?

    The practical difference is not just price — it is what each platform lets an institution verify. Scopus and Web of Science apply proprietary, selective journal-inclusion criteria and sell access to the resulting index. OpenAlex indexes broadly by default and publishes the inclusion logic as open code, which means an institution can inspect exactly why a work is or is not counted.

    Dimension | OpenAlex | Scopus (Elsevier) | Web of Science (Clarivate)
    Governance | Non-profit (OurResearch), POSI-aligned | Commercial publisher | Commercial data company
    Data licence | CC0, fully open, bulk download | Proprietary, licensed access only | Proprietary, licensed access only
    Core journal metric | No proprietary journal metric | CiteScore (four-year citation average) | Journal Impact Factor
    Coverage approach | Broad, automated aggregation, strong Diamond OA and non-English coverage | Curated, selective journal list | Curated, selective journal list
    Cost to institutions | Free API; optional paid support tier | Subscription | Subscription

    CiteScore, Scopus’s flagship journal metric, averages the citations a journal’s documents receive over a four-year window — a useful signal, but one calculated entirely inside a closed system that institutions cannot independently reproduce. OpenAlex does not publish an equivalent branded journal score; instead it exposes the underlying citation and work-level data so that any bibliometrician can calculate their own indicator and show their working.

    Coverage differences matter for equity as much as accuracy. A 2024 study cited in OpenAlex’s Wikipedia entry found the platform indexes more than 12,500 Diamond Open Access journal titles, including over 60% of Diamond OA journals absent from both Web of Science and Scopus — a direct consequence of not gating inclusion behind a commercial selection committee.

    Why Does Open Metrics Infrastructure Serve DORA’s Transparency Principle?

    The San Francisco Declaration on Research Assessment (DORA), first published in 2012, asks funders, institutions, and publishers to stop substituting journal-based proxies for direct evaluation of research and to be explicit about the criteria used in funding, hiring, and promotion decisions. That explicitness requirement is where the platform choice stops being neutral.

    A closed index can tell an institution that a number was calculated a certain way, but it cannot let that institution independently verify how, because the underlying citation graph is licensed, not published. An open metadata layer removes that opacity: the same dataset an institution cites in a tenure file or a funding report can be downloaded, re-run, and checked by anyone, including the researcher being assessed.

    Adoption evidence has followed the argument. Leiden University announced in September 2023 that it would produce an open-source edition of its CWTS Leiden Ranking using OpenAlex data from 2024 onward. Sorbonne University announced in December 2023 that it was discontinuing its Web of Science subscription in favour of OpenAlex. In 2024, France’s Ministry of Higher Education and Research pledged financial support to the project, describing it as “crucial open science infrastructure,” and the Arcadia Fund awarded OurResearch a $7.5 million grant explicitly to build OpenAlex into a sustainable alternative to commercial citation indices.

    • Leiden University: open-source CWTS Leiden Ranking edition built on OpenAlex data (from 2024)
    • Sorbonne University: Web of Science subscription discontinued in favour of OpenAlex (December 2023)
    • French Ministry of Higher Education and Research: financial commitment to OpenAlex as open science infrastructure (2024)
    • Arcadia Fund: $7.5 million grant to OurResearch for OpenAlex sustainability (March 2024)

    None of this means closed indices lack value; their curated selection and mature analytics tooling still suit some high-stakes evaluations. But where the explicit requirement is transparency rather than convenience, an auditable, CC0-licensed data layer meets DORA’s stated principle more directly than a licensed black box.

    Common Questions About OpenAlex

    What is OpenAlex used for?

    Universities, funders, and publishers use OpenAlex to track publication output, measure open-access status, benchmark institutional performance, and feed alternative rankings such as the open-source CWTS Leiden Ranking. Its free API also underpins third-party dashboards, systematic-review tools, and research-information systems that need citation and affiliation data without a subscription fee.

    Is OpenAlex legit?

    Yes. OpenAlex is maintained by OurResearch, a non-profit with a multi-year record of building open scholarly infrastructure, and it has formally adopted the Principles of Open Scholarly Infrastructure (POSI). Its data and methodology are openly licensed and auditable, and the platform is already cited in peer-reviewed scientometrics research, including a 2022 arXiv paper by its founders.

    Is OpenAlex free?

    Yes. The full dataset is released under a Creative Commons Zero (CC0) public-domain licence, and the REST API can be queried without a subscription, unlike Scopus or Web of Science. Rate limits apply to free API use, and OurResearch offers an optional paid support tier for high-volume institutional queries.
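    As a rough illustration of what querying the API involves, the sketch below only builds a request URL and does not call the network. The `works` endpoint, the `authorships.institutions.ror` filter, the `group_by` parameter and the `mailto` contact parameter reflect the OpenAlex documentation as understood at the time of writing; treat them as assumptions and check the current API reference, since access rules and limits can change. The ROR identifier and email address are placeholders.

```python
from typing import Optional
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"


def works_by_year_url(ror_id: str, contact_email: Optional[str] = None) -> str:
    """Build a URL asking for an institution's works grouped by year."""
    params = {
        "filter": f"authorships.institutions.ror:{ror_id}",
        "group_by": "publication_year",
    }
    if contact_email:
        # Identifies the caller to the service (assumed parameter name).
        params["mailto"] = contact_email
    # Keep ':', '/' and '@' readable instead of percent-encoding them.
    return f"{OPENALEX_WORKS}?{urlencode(params, safe=':/@')}"


# Placeholder identifier and address, for illustration only.
print(works_by_year_url("https://ror.org/EXAMPLE", "office@example.org"))
```

    Fetching the resulting URL with any HTTP client should return JSON; keeping the URL alongside the reported figure lets a reviewer re-run the same query later.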

    Who owns OpenAlex?

    OpenAlex is created and maintained by OurResearch, a US-based non-profit operating as Impactstory, Inc., not by a commercial publisher. Governance sits with a mission-driven organisation rather than a shareholder-owned company — the structural distinction that underpins its CC0 licensing and its appeal to institutions pursuing publisher-independent, DORA-aligned metrics.

    What Should Institutional Leaders Do Next?

    Platform choice is now a governance decision, not just a procurement one. An institution that cites OpenAlex data in a promotion case, a funding report, or an open-access dashboard is making a transparency claim as well as a metrics claim, and that claim should be tested before it is relied upon.

    • Map which existing assessment workflows (tenure, funding reports, rankings submissions) rely on a metric an evaluator cannot independently reproduce.
    • Pilot OpenAlex alongside — not instead of — existing subscriptions, comparing coverage gaps directly against Scopus or Web of Science outputs for your own institutional corpus.
    • Document data provenance explicitly in assessment criteria, consistent with DORA’s requirement for stated, auditable methodology.
    • Track POSI-aligned infrastructure commitments (OpenAlex, Crossref, ORCID, ROR) as the durable layer beneath any commercial tool an institution also chooses to license.
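    The pilot step in the list above, comparing coverage for an institution's own corpus, can be approximated by matching DOIs exported from each source. The sketch below assumes each export has already been reduced to a list of DOI strings; the function names and the sample DOIs are hypothetical, and records without a DOI would need a separate matching strategy.

```python
def normalise_doi(doi: str) -> str:
    """Lower-case a DOI and strip common resolver prefixes."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def coverage_gaps(open_dois, licensed_dois) -> dict:
    """Split two DOI lists into shared records and records unique to each."""
    a = {normalise_doi(d) for d in open_dois}
    b = {normalise_doi(d) for d in licensed_dois}
    return {"both": a & b, "open_only": a - b, "licensed_only": b - a}


# Hypothetical exports from an open index and a licensed index.
open_export = ["https://doi.org/10.1000/A1", "10.1000/b2", "doi:10.1000/c3"]
licensed_export = ["10.1000/a1", "10.1000/D4"]

gaps = coverage_gaps(open_export, licensed_export)
print({name: sorted(dois) for name, dois in gaps.items()})
```

    Reviewing the `open_only` and `licensed_only` sets by document type and language usually says more about fit for a given institution than the headline counts do.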

    Open, non-proprietary metadata will not replace every function a commercial index performs today. But as funders and assessment reformers keep pressing for auditable evidence over proprietary scores, institutions that already understand — and can reproduce — their own metrics will be the ones best placed to defend them.

  • CiteScore vs Impact Factor Under DORA and CoARA

    In the CiteScore vs Impact Factor comparison, neither metric wins under research-assessment reform. CiteScore (Elsevier/Scopus) uses a four-year window and a broad set of document types, while Journal Impact Factor (Clarivate/Web of Science) uses a two-year window with a denominator limited to “citable items”. DORA and CoARA both instruct assessors not to use either as a proxy for research quality.

    CiteScore is Elsevier’s Scopus-based journal metric, calculated by dividing the citations a title’s documents receive across a four-year window by the number of documents it published in that same window. Journal Impact Factor (JIF) is Clarivate’s older, narrower equivalent, published annually through the Journal Citation Reports (JCR). Both numbers get quoted constantly in tenure files, funding applications and journal marketing, and both are formally out of step with how research-assessment reform now says journals should be judged.

    What Is the Difference Between CiteScore and Impact Factor?

    The core difference is database, window length, and document scope. CiteScore draws on Scopus and, under the methodology Elsevier adopted in 2020, counts citations to peer-reviewed document types (articles, reviews, conference papers, book chapters and data papers) over a four-year window, applying the same document set to numerator and denominator. Journal Impact Factor draws on Web of Science and restricts its denominator to “citable items” (chiefly research articles and reviews) over a two-year window, even though its numerator counts citations to all document types.

    That asymmetry in JIF’s own formula — a broad numerator over a narrow denominator — is one of the most persistent, well-documented criticisms of the metric, and is a large part of why CiteScore, introduced by Elsevier in December 2016, was built with a wider document scope from the outset.

    Feature | CiteScore | Journal Impact Factor
    Provider | Elsevier | Clarivate
    Underlying database | Scopus | Web of Science (Journal Citation Reports)
    Citation window | 4 years | 2 years
    Document types counted | Articles, reviews, conference papers, book chapters, data papers | Primarily “citable items” (articles, reviews)
    Access | Free on Scopus journal pages | Requires a JCR subscription
    First introduced | December 2016 | Concept 1955; JCR published annually since 1975

    How Is Each Metric Calculated?

    CiteScore for year Y equals citations received in Y-3 through Y to documents published in Y-3 through Y, divided by the number of documents published across that same four-year span. Elsevier updates a “CiteScore Tracker” monthly, so the figure moves before the annual snapshot is finalised, a transparency feature JIF does not offer.

    Journal Impact Factor for year Y equals citations received in Y to items published in Y-1 and Y-2, divided by the number of “citable items” published in those same two years. Clarivate publishes the finalised figure once a year through the Journal Citation Reports, alongside a JIF quartile ranking within each subject category.
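    The two definitions can be compared side by side on a toy journal. The sketch below is an illustration under stated assumptions, not either provider's production calculation: the `Item` class and all citation counts are hypothetical, document-type filtering for the four-year score is left to the caller, and the citing years are passed in explicitly because they depend on the methodology version being reproduced.

```python
from dataclasses import dataclass, field
from typing import Iterable


@dataclass
class Item:
    pub_year: int
    citable: bool  # True for articles/reviews under JIF-style rules
    cites_by_year: dict[int, int] = field(default_factory=dict)


def impact_factor(items: list[Item], year: int) -> float:
    """Citations in `year` to ALL items from the two prior years,
    divided by the number of citable items from those years."""
    pool = [i for i in items if i.pub_year in (year - 2, year - 1)]
    citable = sum(1 for i in pool if i.citable)
    if citable == 0:
        raise ValueError("no citable items in the two-year window")
    return sum(i.cites_by_year.get(year, 0) for i in pool) / citable


def four_year_score(items: list[Item], year: int, citing_years: Iterable[int]) -> float:
    """Citations received in `citing_years` to documents published in
    year-3..year, divided by the count of those same documents."""
    pool = [i for i in items if year - 3 <= i.pub_year <= year]
    if not pool:
        raise ValueError("no documents in the four-year window")
    years = set(citing_years)
    cites = sum(n for i in pool for y, n in i.cites_by_year.items() if y in years)
    return cites / len(pool)


# Hypothetical journal: four citable items and one editorial (2023).
journal = [
    Item(2021, True, {2022: 2, 2023: 3, 2024: 4}),
    Item(2022, True, {2023: 1, 2024: 5}),
    Item(2023, True, {2024: 2}),
    Item(2023, False, {2024: 1}),
    Item(2024, True, {2024: 1}),
]
print(impact_factor(journal, 2024))                       # 4.0
print(four_year_score(journal, 2024, range(2021, 2025)))  # 3.8
print(four_year_score(journal, 2024, [2024]))             # 2.6
```

    In the two-year figure the editorial's citation counts in the numerator while the editorial itself is left out of the denominator, so the same journal yields different values depending only on the counting rules.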

    • Shorter windows (JIF) react faster to hot topics but are more volatile for low-volume or slow-citing fields.
    • Longer windows (CiteScore) smooth out volatility but can undervalue journals in genuinely fast-moving disciplines.
    • Neither window length is “correct” — both were chosen as engineering trade-offs, not as validated proxies for quality.

    What Do DORA and CoARA Say About Journal-Level Metrics?

    The San Francisco Declaration on Research Assessment (DORA), published in 2012 and now signed by tens of thousands of individuals and organisations across more than 160 countries, states that journal-based metrics — explicitly including Impact Factor — should not be used “as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion, or funding decisions.” Although DORA’s original text names JIF, the same critique applies directly to CiteScore: both are journal-level averages applied to individual outputs and individual people.

    The Coalition for Advancing Research Assessment (CoARA), launched in 2022 and coordinated with the European University Association, commits its signatories (now numbering hundreds of universities, funders and research organisations) to “abandon inappropriate uses in research assessment of journal- and publication-based metrics, in particular any inappropriate uses of Journal Impact Factor.” By the same logic, CiteScore falls under that commitment, since the Agreement’s ten commitments target the practice of substituting journal metrics for quality judgement, not one specific brand of metric.

    Neither declaration asks institutions to abolish CiteScore or JIF outright. Both ask assessors to stop using either figure as a shortcut for reading, or for judging, the individual piece of work in front of them.

    CiteScore vs Impact Factor: Which Survives Assessment Reform?

    Under DORA and CoARA criteria, neither metric “survives” as a legitimate proxy for individual-level quality, but CiteScore scores better on two specific reform tests: transparency and access. CiteScore values and the monthly tracker are free to view on Scopus journal pages, whereas JIF values sit behind a JCR subscription, which is one reason CiteScore is often described as the more auditable of the two.

    Jurisdiction-specific policy already reflects this shift. The UK’s Research Excellence Framework (REF) guidance instructs assessment panels not to use journal-level metrics, including Impact Factor, as a proxy for output quality — panel members are required to read and judge the submitted work itself. Frameworks such as the Leiden Manifesto (2015) and the UK’s Metric Tide review (2015) reach the same conclusion from a different angle: any single citation metric, however calculated, is a partial and gameable signal that needs qualitative context, not a standalone score.

    In practice, most responsible-assessment guidance converges on the same answer: use CiteScore or JIF only as one directional data point about a journal’s citation behaviour — never as a stand-in for peer review, narrative CVs, or discipline-aware qualitative judgement of an individual’s work.

    Common Questions on CiteScore vs Impact Factor

    Which is better, Impact Factor or CiteScore?

    Neither is “better” in absolute terms. CiteScore suits fields with slower citation cycles and full Scopus coverage, while Journal Impact Factor suits comparisons within Web of Science’s narrower, more selective index. Under DORA and CoARA criteria, both are inappropriate substitutes for peer review or individual-level research assessment.

    What is a good CiteScore for a journal?

    A “good” CiteScore is field-relative. Elsevier’s own guidance points assessors toward a journal’s CiteScore Percentile rather than the raw number — a title at the 90th percentile outperforms 90% of journals in its Scopus subject category, which is more meaningful than comparing raw scores across disciplines.

    Is 3.5 a good Impact Factor?

    There is no universal threshold. A 3.5 Impact Factor is strong in fields with slow, sparse citation practices but modest in fast-citing fields such as immunology or oncology. Clarivate’s Journal Citation Reports ranks journals by subject-category quartile, not by a fixed numeric cutoff, for exactly this reason.

    What is a decent CiteScore?

    Elsevier measures this through the CiteScore Percentile: a title in the 96th percentile ranks as high as, or higher than, 96% of journals in its category. Institutions applying DORA principles are advised to cite percentile standing within a discipline rather than treat any single CiteScore value as “decent” in isolation.
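    A percentile of this kind is straightforward to reproduce for any list of scores. The sketch below follows the "as high as, or higher than" wording used above; the function name and the category scores are hypothetical, and Elsevier's exact tie-handling and rounding rules are not reproduced.

```python
def percentile_rank(score: float, category_scores: list[float]) -> float:
    """Percentage of titles in the category whose score is <= `score`.

    `category_scores` is assumed to include the title being ranked.
    """
    if not category_scores:
        raise ValueError("category_scores is empty")
    at_or_below = sum(s <= score for s in category_scores)
    return 100 * at_or_below / len(category_scores)


# Hypothetical subject category with ten titles.
category = [0.5, 1.2, 2.0, 3.1, 4.4, 5.0, 6.3, 7.7, 9.0, 12.5]
print(percentile_rank(9.0, category))  # 90.0
print(percentile_rank(2.0, category))  # 30.0
```

    The same raw score can land at very different percentiles in different categories, which is why percentile standing within a discipline is the more defensible figure to report.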

    Implications for Institutions and Publishers

    For research administrators, the practical takeaway is procedural, not metric-specific: audit promotion, tenure and funding criteria for language that treats CiteScore or JIF as a quality proxy, and replace it with narrative or portfolio-based evaluation where DORA or CoARA commitments apply — a shift increasingly embedded in research administration standards and workflows. For publishers, transparency about which metric — and which window — is being quoted matters more than which number is higher, since CiteScore and JIF are not interchangeable and a journal can carry a strong figure on one while looking average on the other.

    As more funders and universities formalise CoARA commitments, expect journal-level metrics to persist as directional signals in publisher marketing and library collection decisions, while disappearing — by policy, not by accident — from individual hiring, promotion and grant-review criteria.

  • OECD’s Reforming Research Assessment for Better Science: A 2026 Guide for Research Offices

    The OECD’s 2026 report, “Reforming Research Assessment for Better Science,” concludes that research assessment relying on narrow publication metrics and commercial rankings distorts research culture, and it recommends that institutions cut low-value evaluation, adopt open data infrastructures, and use AI in assessment only with caution. For research offices, the report’s six policymaker recommendations translate into concrete changes to how institutional evaluation criteria, data sourcing, and staff training are run.

    Research assessment is the systematic process of monitoring, evaluating and reviewing research inputs, processes, outputs and impacts, carried out by governments, funders, universities and publishers. The OECD policy brief “Reforming Research Assessment for Better Science” (OECD Policy Briefs No. 56, published 29 April 2026) sets out why that process is misaligned with how science now works, and what research-performing organisations should do about it.

    What is the OECD’s 2026 report on reforming research assessment?

    “Reforming Research Assessment for Better Science” is an OECD Policy Brief (No. 56) published on 29 April 2026 that reviews why current research-assessment practices are misaligned with the evolving nature of science, and sets out six actions for policymakers and institutions. It is accompanied by a longer evidence base, OECD Science, Technology and Industry Working Paper No. 2026/7, “New Expectations and Demands from Science: Rethinking Research Assessment Frameworks,” which maps the actors, tensions and drivers behind the reform movement.

    Both documents are credited to the OECD Directorate for Science, Technology and Innovation, with Frédéric Sgard listed as the named contact. The brief carries the persistent identifier DOI 10.1787/f6202159-en; the working paper carries DOI 10.1787/0c685800-en. Neither document proposes a single replacement metric — instead, both argue for a system-level shift in how, and how often, assessment is conducted.

    Why does the OECD say metrics-based assessment needs reform?

    The OECD argues that heavy reliance on publication counts, citation rates and journal impact factors has produced perverse incentives, including a “publish or perish” culture that rewards quantity over quality. The brief cites peer-reviewed evidence — including Fanelli (2010) on publication bias and Öztürk and Taşkın (2024) on how metric-based evaluation fuels questionable publishing — to support this conclusion.

    Three specific harms are named:

    • High-risk, high-reward research is systematically undervalued because standard indicators cannot capture long-horizon payoff.
    • Transdisciplinary and societally engaged research is poorly captured by discipline-bound, publication-and-citation frameworks.
    • Assessment volume has grown faster than institutional capacity to absorb it, creating what the OECD calls research-assessment fatigue among researchers and administrators alike, a burden previously quantified in Technopolis Group’s 2015 REF Accountability Review.

    The report is equally direct about rankings. National and global university league tables, it states, “should not be used in RA” because they rely on non-transparent proprietary methods, are biased toward STEM subjects and English-language output, and — per the UN University’s Independent Expert Group 2023 Statement on Global University Rankings — can accentuate global, regional and national inequalities.

    What alternative evaluation tools and infrastructures does the OECD recommend?

    The OECD does not prescribe one alternative framework; instead, it maps nine existing international initiatives that research offices can draw on, and it names open, non-proprietary databases such as OpenAlex and Redalyc as viable substitutes for closed commercial data sources. The report’s own comparison table — reproduced and dated below — is the clearest single reference point for institutions deciding which framework to adopt or reference in policy documents.

    Initiative | Year | Core contribution
    DORA (San Francisco Declaration on Research Assessment) | 2012 | Discourages journal-based metrics as a proxy for quality; spawned the TARA practical-tools project in 2021
    Leiden Manifesto | 2015 | Principles and best practice for using quantitative indicators responsibly
    INORMS Research Evaluation Group | 2018 | SCOPE Framework for Research Evaluation and the “More than Our Rank” initiative
    FOLEC-CLACSO | 2019 | Regionally specific research-assessment guidelines for Latin America
    Hong Kong Principles | 2019 | Minimising perverse incentives; rewarding trustworthy research practice
    Science Europe Position Statement | 2020 | Recommendations on research assessment processes for funders
    CoARA (Coalition for Advancing Research Assessment) | 2022 | Agreement on Reforming Research Assessment, with global signatories
    Barcelona Declaration | 2024 | Advocates open research information infrastructure
    Global Research Council RRA Working Group | 2024 | An 11-dimension framework for responsible research assessment

    The OECD’s own recommendation is not to pick a winner among these, but to “promote sustained dialogue” between them and to have governments recognise alignment with these emerging international principles as a criterion within cyclical institutional assessment exercises.

    What should research offices do differently?

    The report’s six policymaker actions each carry a direct operational counterpart for institutional research offices, from auditing evaluation volume to renegotiating data contracts. Research administrators reading the brief should map each national-level recommendation onto an institutional equivalent:

    • Reduce assessment volume: audit which internal reviews, reports and dashboards serve a “clearly defined objective” — and retire those that do not.
    • Diversify data sources: reduce dependency on single proprietary bibliometric platforms by testing open alternatives such as OpenAlex alongside existing subscriptions.
    • Remove rankings from internal criteria: strip commercial league-table position from promotion, tenure and internal funding-allocation rubrics.
    • Govern AI use cautiously: where AI tools are piloted in peer-review triage or portfolio analysis, require transparent, explainable models and documented human oversight rather than opaque large language models.
    • Invest in staff capacity: the brief is explicit that “guidance, training and capacity building will be key” — senior administrators, librarians and peer reviewers all need structured onboarding to new evaluation frameworks, not just a policy memo.
    • Adopt proportionate methods: match the evaluation method (summative for decisions, formative for development, or a blend) to the actual purpose of each assessment exercise.

    Institutions already engaged with CASRAI’s research administration resources will recognise these as extensions of existing responsible-metrics and open-science commitments rather than a wholesale change of direction.

    Answer-first Q&A

    What is responsible research assessment?

    Responsible research assessment refers to evaluation approaches that incentivise, reflect and reward the plural characteristics of high-quality research rather than relying on narrow proxy metrics such as journal impact factor. It combines qualitative judgement with proportionate, context-appropriate quantitative indicators, following principles set out by DORA, the Leiden Manifesto and CoARA’s 2022 Agreement.

    Why does the OECD discourage the use of rankings in research assessment?

    The OECD states that national and global rankings are marketing tools built on non-transparent proprietary data and methods that are not adapted to different institutions’ profiles or purposes. Because they are biased toward STEM subjects and English-language scholarship, their use in funding or hiring decisions can exacerbate global, regional and national inequalities rather than reflect genuine research quality.

    What role should AI play in research assessment, according to the OECD?

    The OECD says AI’s role in research assessment “needs to be carefully examined” rather than adopted by default. It favours transparent, deterministic models over opaque large language models, requires ex-ante risk assessment and human oversight, and warns that AI licensing costs can quietly increase institutions’ dependency on commercial technology providers.

    How can research offices reduce the burden of research assessment?

    Research offices can reduce burden by evaluating “only what and when necessary,” in the OECD’s words — applying assessment solely where a clearly defined objective exists and a less resource-intensive process would not suffice. Matching evaluation type (summative versus formative) to actual purpose, rather than defaulting to full review, is the report’s core proportionality test.

    What happens next for research assessment reform?

    The OECD frames reform as an iterative, long-term structural transition rather than a one-off policy change, pointing to national experiments already under way as evidence. It cites Rushforth’s 2024 analysis of the Netherlands’ “Recognition and Rewards” programme and China’s institutional hybrid responses (Liang, Zhao and Li, 2024) as examples of top-down signals interacting with bottom-up institutional experimentation.

    Concrete pilots are already generating data: Luxembourg’s National Research Fund reports three years of narrative-CV use as of 2026, and UK researchers have begun assessing generative AI’s potential role ahead of the REF 2029 exercise. For research offices, the practical takeaway is that no single framework will be mandated — institutions that start testing proportionate, criteria-linked alternatives now will be better positioned as national funders and assessment bodies converge around the OECD’s six recommendations.

  • CoARA Action Plan: Reform or Box-Ticking?

    CoARA’s action plan framework requires every signatory to publish, within a year of joining, a time-bound roadmap for reforming its research-assessment criteria, and to show progress at a five-year checkpoint due at the end of 2027. More than three years after the Coalition’s November 2022 launch, membership has grown from roughly 100 founding organisations to more than 830. Yet CoARA’s own public tracker shows most signatories have not yet deposited a citable action plan, and that deposit record is the real test of whether this is reform or box-ticking.

    The CoARA action plan is the documented, time-bound roadmap each Coalition for Advancing Research Assessment signatory must publish, setting out how it will revise the criteria, tools and processes it uses to evaluate research, researchers and research-performing organisations against the Agreement’s core commitments.

    What does the CoARA action plan actually require?

    The Agreement on Reforming Research Assessment (ARRA) obliges signatories to review or develop criteria, tools and processes against its ten commitments, and to record that process as an action plan with defined milestones. Under CoARA’s own guidance, the first plan is due within one year of signing (eighteen months for early signatories), with a further checkpoint at the end of 2027, by which point signatories must have completed at least one full review-and-development cycle.
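
    The deadline rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not CoARA tooling: the function names `add_months` and `first_plan_due` are hypothetical, and because the article does not state the cut-off that defines an “early signatory”, that status is passed in as a flag by the caller.

```python
import calendar
from datetime import date

# "End of 2027" checkpoint, as stated in the article.
CHECKPOINT = date(2027, 12, 31)


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    If the target month is shorter than the start month, the day is
    clamped to the last valid day (e.g. 29 Feb -> 28 Feb).
    """
    total = start.month - 1 + months
    year = start.year + total // 12
    month = total % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def first_plan_due(signed_on: date, early_signatory: bool = False) -> date:
    """First action plan deadline: 12 months after signing, or 18 months
    for organisations the caller classifies as early signatories."""
    return add_months(signed_on, 18 if early_signatory else 12)


def is_overdue(signed_on: date, today: date, early_signatory: bool = False) -> bool:
    """True if the first-plan deadline has already passed on `today`."""
    return today > first_plan_due(signed_on, early_signatory)
```

    With an invented signing date, `first_plan_due(date(2024, 2, 29))` returns `date(2025, 2, 28)`, because the helper clamps to the last valid day of the target month.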

    Crucially, CoARA imposes no fixed template. Organisations have “full freedom” in how they design their plan, and the Coalition explicitly asks signatories not to duplicate existing responsible-assessment work. That flexibility is defensible for a coalition spanning universities, funders, academies and research infrastructures — but it also means the Coalition has no standard unit for measuring whether commitments are being kept, only a request that plans be deposited publicly via a shared Zenodo collection.

    Has reform reached hiring, promotion and grant criteria?

    Some of the evidence is concrete. Loughborough University’s action plan, deposited in October 2023, embeds existing responsible research assessment practice into formal review criteria rather than treating CoARA as a new bolt-on process. Goldsmiths, University of London published a 2024–2029 plan explicitly tied to promotion and appraisal reform, and the University of Edinburgh deposited an updated plan in 2025 addressing how researchers and research-support staff are evaluated.

    Funders have moved too. Independent Research Fund Denmark (DFF) published an updated action plan in May 2025 that tracks delivery status against each commitment, a rare example of a signatory reporting progress rather than just intent. Italy’s national evaluation agency, ANVUR, has a 2024–2027 plan aimed at aligning national research-assessment criteria, not just one institution’s, with CoARA principles.

    These cases show the mechanism can produce real, checkable change in grant review and promotion documentation. The open question is how representative they are of the Coalition as a whole.

    How many signatories have actually filed an action plan?

    CoARA’s own live tracker, “Action Plans: Submitted & Pending to Date”, lists roughly 660 organisation entries with a due date for their first action plan. Of those, only around 136 carry an actual Zenodo DOI, meaning a plan has been deposited and made citable. Roughly 514 of the remaining entries, including many whose plans were originally due back in October 2023, are still marked “Pending” almost three years on.

    That is a completion rate of roughly one in five against CoARA’s own one-year deadline. It does not necessarily mean four in five signatories have done nothing internally — some may be reforming quietly without depositing paperwork — but it is the single clearest, most falsifiable indicator CoARA itself publishes, and it currently favours the “declaratory” reading of the Coalition’s progress over the “reformed” one.

    Metric | Figure (CoARA live tracker, accessed July 2026)
    Organisation-level action plan entries tracked | ~660
    Entries with a deposited, citable action plan (DOI issued) | ~136 (≈21%)
    Entries still marked “Pending” | ~514 (≈78%)
    Total current CoARA member organisations | 834, across 60+ countries
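
    The percentages in the table can be reproduced from the approximate counts. The sketch below assumes the three rounded figures quoted above; the helper name `share` is illustrative. The small residual it reports may simply reflect rounding in those approximate counts rather than a separate tracker status.

```python
def share(count: int, total: int) -> float:
    """Return `count` as a percentage of `total`, rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 1)


# Approximate counts quoted in the article (CoARA tracker, accessed July 2026).
TRACKED_ENTRIES = 660
DEPOSITED_WITH_DOI = 136
MARKED_PENDING = 514

deposited_share = share(DEPOSITED_WITH_DOI, TRACKED_ENTRIES)
pending_share = share(MARKED_PENDING, TRACKED_ENTRIES)

# Entries not covered by either rounded count.
residual = TRACKED_ENTRIES - DEPOSITED_WITH_DOI - MARKED_PENDING

print(f"Deposited with DOI: {deposited_share}%")  # 20.6%, i.e. about one in five
print(f"Still pending:      {pending_share}%")    # 77.9%
print(f"Residual entries:   {residual}")          # 10
```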

    CoARA vs DORA: does history repeat itself?

    CoARA did not invent the credibility problem it now faces. The San Francisco Declaration on Research Assessment (DORA), launched in 2012 to curb inappropriate use of the Journal Impact Factor, has accumulated more than 27,000 individual and organisational signatures across 174 countries, according to sfdora.org’s own signer registry. Yet studies of research, promotion and tenure documents have repeatedly found continued reliance on journal-based metrics at institutions that formally signed DORA years earlier — a gap between signature and practice that critics now cite as the precedent CoARA risks repeating.

    CoARA’s design tries to close that gap by making the action plan, not the signature, the operative commitment, with a public deposit requirement and a 2027 checkpoint. A 2024 critique circulated on arXiv (Baccini et al.) raised the opposite risk: that shifting assessment toward qualitative, panel-based peer review could trade transparent metric-driven gatekeeping for a less transparent, harder-to-audit equivalent. Both critiques point to the same underlying test: not whether an organisation signs, but whether its actual review paperwork changes.

    Feature | DORA (2012) | CoARA (2022)
    Core ask | Stop using Journal Impact Factor as a proxy for quality in funding, hiring and promotion | Ten commitments on qualitative, diverse and open research assessment
    Accountability mechanism | Voluntary signature; no mandatory public action plan | Mandatory action plan within one year, deposited on Zenodo, checkpoint by end of 2027
    Current scale | 27,000+ signatures, 174 countries (sfdora.org) | 834 member organisations, 60+ countries (coara.org)
    Documented gap | Continued JIF use found in signatory RPT criteria | ~78% of due action-plan entries still “Pending” on CoARA’s own tracker

    Common questions about the CoARA action plan

    What is CoARA?

    The Coalition for Advancing Research Assessment is a membership body of universities, funders, academies and research infrastructures committed to reforming how research, researchers and research-performing organisations are evaluated. It operates under the Agreement on Reforming Research Assessment, signed from November 2022, which sets shared commitments rather than a single enforced standard.

    What are CoARA National Chapters?

    CoARA National Chapters are country- or region-specific groups, such as the chapter for Ireland, that help local signatories interpret the Agreement’s commitments in their own funding, promotion and language context. They provide practical support for drafting action plans and coordinate national-level alignment with funder policy, including engagement with existing metrics guidance such as DORA.

    Is CoARA the same as DORA?

    No. DORA is a narrower 2012 declaration focused specifically on removing inappropriate Journal Impact Factor use from assessment. CoARA is a broader 2022 coalition with ten commitments covering qualitative assessment, output diversity and open science, and it requires a public, time-bound action plan rather than a one-off signature.

    How many organisations have signed CoARA?

    CoARA’s live membership register lists 834 organisations across more than 60 countries as of mid-2026, up from just over 100 at the November 2022 launch. Growth in membership has significantly outpaced growth in verified, publicly deposited action plans over the same period.

    What this means for research administrators

    For institutional leaders and research-administration teams, CoARA membership is not self-certifying reform. Signing the Agreement creates a public commitment; only a deposited, dated action plan against the ten commitments creates an auditable one. Institutions that have not yet filed should treat the gap as reputational exposure, not paperwork.

    • Check whether your organisation’s action plan (if due) has been deposited to the CoARA Zenodo collection, not just drafted internally.
    • Map each commitment against a specific, named change to hiring, promotion or grant-review criteria — not a general statement of intent.
    • Use the 2027 checkpoint as an internal deadline for demonstrating at least one completed review-and-development cycle, in line with the ARRA’s own timeframe.
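
    The three checks above could be tracked in a simple internal record. The following is a minimal sketch under stated assumptions: the class `ActionPlanStatus`, its fields and the example change text are all hypothetical, and nothing here queries Zenodo or the CoARA tracker, so the deposit status must be confirmed by hand.

```python
from dataclasses import dataclass, field
from datetime import date

CHECKPOINT = date(2027, 12, 31)  # "end of 2027", as stated in the article
TOTAL_COMMITMENTS = 10           # the Agreement's ten commitments


@dataclass
class ActionPlanStatus:
    """Hypothetical internal record of one organisation's CoARA readiness."""
    deposited_on_zenodo: bool = False
    review_cycle_completed: bool = False
    # commitment number -> specific, named changes to assessment criteria
    named_changes: dict[int, list[str]] = field(default_factory=dict)

    def unmapped_commitments(self) -> list[int]:
        """Commitments with no named change recorded against them."""
        return [
            n for n in range(1, TOTAL_COMMITMENTS + 1)
            if not self.named_changes.get(n)
        ]

    def open_items(self, today: date) -> list[str]:
        """Checklist items that are still outstanding as of `today`."""
        items = []
        if not self.deposited_on_zenodo:
            items.append("Deposit the action plan in the CoARA Zenodo collection")
        missing = self.unmapped_commitments()
        if missing:
            items.append(f"Name a concrete change for commitments {missing}")
        if not self.review_cycle_completed:
            days_left = (CHECKPOINT - today).days
            items.append(f"Complete one review cycle ({days_left} days to checkpoint)")
        return items
```

    An organisation that has deposited its plan but recorded a named change against only one commitment would still see two open items: the unmapped commitments and the outstanding review cycle.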

    Outlook: what would count as proof by 2027?

    CoARA’s five-year touchpoint at the end of 2027 is the moment the “reform or box-ticking” question gets a real answer. If the proportion of signatories with a deposited, dated action plan rises substantially from today’s roughly one-in-five, and if more funders publish delivery-tracked updates in the style of Denmark’s DFF, the declaratory reading weakens. If the Pending column stays this full, CoARA will have reproduced the exact credibility gap DORA has spent over a decade trying to close.