Author: MCP Service

  • Leiden Manifesto Checklist for Research Offices

    The Leiden Manifesto for Research Metrics sets out ten principles, published as a comment in Nature in 2015, for the responsible use of quantitative indicators in research evaluation. Research offices can convert each principle into a direct audit question, testing whether KPI dashboards, promotion criteria and grant-review rubrics rely on a single metric, ignore field norms, or substitute for qualitative judgement.

    The Leiden Manifesto for Research Metrics is a ten-principle framework for the responsible use of bibliometric and other quantitative indicators in evaluating research, published by Diana Hicks, Paul Wouters, Ludo Waltman, Sarah de Rijcke and Ismael Rafols in Nature on 22 April 2015. It was formulated at the 19th International Conference on Science and Technology Indicators, held in Leiden, the Netherlands, in September 2014, and has since been cited more than 4,000 times, according to Google Scholar’s tracking of the original paper.

    What is the Leiden Manifesto for Research Metrics?

    The Leiden Manifesto is a response to what its authors called “impact-factor obsession” — the tendency of universities, funders and promotion committees to substitute a single number for expert judgement. It does not ban metrics. It requires that quantitative indicators support, rather than replace, informed peer assessment of research quality.

    The manifesto’s home institution is the Centre for Science and Technology Studies (CWTS) at Leiden University, where co-author Paul Wouters served as director. CWTS also produces the CWTS Leiden Ranking, a separate bibliometrics-based university ranking — a distinction research offices should not conflate when citing the source.

    What are the ten principles of the Leiden Manifesto?

    Each principle addresses a specific failure mode observed in metric-driven research assessment. The table below states each principle exactly as published, alongside the practical audit question a research office should ask of its own KPI or promotion framework.

    # | Principle (Hicks et al., 2015) | Audit question for your office
    1 | Quantitative evaluation should support qualitative, expert assessment | Does any committee decision rest on a metric alone, with no narrative peer input?
    2 | Measure performance against the research missions of the institution, group or researcher | Are KPIs generic, or tailored to the unit’s stated mission (teaching-intensive, applied, translational)?
    3 | Protect excellence in locally relevant research | Does the framework penalise work published in non-English or regionally focused outlets?
    4 | Keep data collection and analytical processes open, transparent and simple | Can an academic reproduce their own score from publicly documented methodology?
    5 | Allow those evaluated to verify data and analysis | Is there a formal, timely route to challenge or correct metric data before a decision is made?
    6 | Account for variation by field in publication and citation practices | Are raw citation counts compared across disciplines without field normalisation?
    7 | Base assessment of individual researchers on a qualitative judgement of their portfolio | Do the promotion criteria require a portfolio narrative, or just an h-index threshold?
    8 | Avoid misplaced concreteness and false precision | Are decimal-point differences in impact factor or citation rate treated as meaningful?
    9 | Recognise the systemic effects of assessment and indicators | Has the office assessed whether its KPIs create incentives to game submission counts or venues?
    10 | Scrutinise indicators regularly and update them | Is there a scheduled review cycle for the KPI framework itself, not just for scores against it?

    How can a research office audit its KPI and promotion framework against it?

    Running the manifesto as a live audit tool means working through each principle against real artefacts: the appraisal form, the promotion rubric, and the departmental dashboard.

    1. Mark every clause in the promotion/tenure criteria naming a specific metric (impact factor, h-index, citation count).
    2. Check each marked clause has a qualitative narrative requirement alongside it (Principles 1 and 7).
    3. Confirm KPI targets are set per unit mission, not copied institution-wide (Principle 2).
    4. Check non-English-language or applied outputs score on the same scale as high-impact-journal outputs (Principle 3).
    5. Verify each dashboard metric’s data source and calculation method is documented and accessible (Principles 4 and 5).
    6. Confirm citation indicators are field-normalised, not raw counts compared across disciplines (Principle 6).
    7. Look for false precision — ranking staff by two-decimal citation averages (Principle 8).
    8. Ask whether the KPI framework has driven any unintended behaviour, such as salami-slicing publications or discouraging risky research (Principle 9).
    9. Set a fixed review date for the framework itself, independent of individual appraisal cycles (Principle 10).

    A framework that fails more than two or three of these checks is not aligned with the manifesto, regardless of how sophisticated its dashboard software looks. The most common failure in practice is Principle 6: comparing raw citation counts across, say, a mathematics department and a cell biology department. Top-ranked mathematics journals carry impact factors of around 3, while top-ranked cell biology journals carry impact factors of about 30. The manifesto’s authors cite this field-scale gap under Principle 6 to show why uncorrected cross-field comparison is meaningless and normalised indicators are required.
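
    The pass/fail checks above (steps 2 to 9; step 1 is preparation) can be tallied in a few lines. The following is a minimal sketch only: the check labels, the `AuditResult` structure and the `max_failures=2` tolerance are illustrative assumptions, not something the manifesto prescribes.

```python
from dataclasses import dataclass

# Hypothetical labels for the checks in steps 2-9, mapped to the
# principle numbers each step cites in the list above.
CHECKS = {
    "narrative_alongside_metric": (1, 7),
    "kpis_match_unit_mission": (2,),
    "local_outputs_same_scale": (3,),
    "methods_documented_and_verifiable": (4, 5),
    "citations_field_normalised": (6,),
    "no_false_precision": (8,),
    "no_perverse_incentives": (9,),
    "framework_review_scheduled": (10,),
}

@dataclass
class AuditResult:
    failed: list       # labels of the failed checks
    principles: list   # sorted principle numbers implicated by the failures
    aligned: bool      # True when failures stay within the tolerance

def audit(outcomes, max_failures=2):
    """outcomes maps each check label to True (pass) or False (fail)."""
    missing = set(CHECKS) - set(outcomes)
    if missing:
        raise ValueError(f"unanswered checks: {sorted(missing)}")
    failed = [name for name in CHECKS if not outcomes[name]]
    principles = sorted({p for name in failed for p in CHECKS[name]})
    return AuditResult(failed, principles, len(failed) <= max_failures)

# Example: a framework that fails three checks.
outcomes = {name: True for name in CHECKS}
outcomes["citations_field_normalised"] = False
outcomes["no_false_precision"] = False
outcomes["framework_review_scheduled"] = False
result = audit(outcomes)
print(result.principles, result.aligned)  # [6, 8, 10] False
```

    Recording which principles each failure implicates, rather than only a count, keeps the output useful as an agenda for the next policy review.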

    How does the Leiden Manifesto compare with DORA and CoARA?

    The Leiden Manifesto did not appear in isolation. The San Francisco Declaration on Research Assessment (DORA), drafted in December 2012 and published in 2013, preceded it, while the Coalition for Advancing Research Assessment (CoARA) has since built a sector-wide agreement on reforming assessment practice. Research offices are frequently asked which one to adopt.

    Framework | Published | Format | Primary focus
    Leiden Manifesto | 22 April 2015 (Nature comment) | 10 principles | Correct use of quantitative indicators across disciplines and settings
    DORA | 2013 (San Francisco Declaration) | General recommendations + signatory pledge | Eliminating journal impact factor as a proxy for article or researcher quality
    CoARA | 2022 (Agreement on Reforming Research Assessment) | Institutional commitment agreement | Sector-wide reform of hiring, promotion and funding assessment criteria

    DORA has been signed by more than 27,000 individuals and organisations, according to DORA’s own published tally as of March 2026, making it the higher-profile pledge. But when Loughborough University’s LIS-Bibliometrics committee chose a framework for its own policy in 2018, policy manager Elizabeth Gadd selected the Leiden Manifesto because it takes a “broader approach to the responsible use of all bibliometrics across a range of disciplines and settings”, not only journal-level metrics. Elsevier separately announced on 14 July 2020 that it would endorse the manifesto to guide its development of research evaluation.

    In the UK, the Metric Tide review (2015), an independent review led by James Wilsdon and commissioned by the then Higher Education Funding Council for England, reached compatible conclusions. It recommended that metrics support, not replace, peer review within the research administration processes underpinning the Research Excellence Framework. A research office building a REF-adjacent KPI policy should treat the two as aligned, not competing, references.

    Common questions and what comes next for research offices

    Who wrote the Leiden Manifesto for Research Metrics?

    The manifesto was written by Diana Hicks, professor of public policy at Georgia Institute of Technology, and Paul Wouters, then director of CWTS at Leiden University, together with co-authors Ludo Waltman, Sarah de Rijcke and Ismael Rafols. It was published as a comment in Nature, volume 520, on 22 April 2015.

    Does the Leiden Manifesto ban the use of bibliometrics tools?

    No. The manifesto does not prohibit bibliometrics tools such as Web of Science, Scopus or Dimensions. It requires that any output from these tools — citation counts, h-indices, journal metrics — be interpreted alongside qualitative expert review and adjusted for field-specific citation norms before it informs a decision.

    Why does the importance of bibliometrics remain contested?

    Bibliometrics matter because they scale evaluation across thousands of researchers where individual peer review is impractical. The contested part is misuse: treating a single indicator as an objective proxy for quality, rather than one input alongside portfolio review, mission fit and field context, as the manifesto’s ten principles specify.

    How often should a research office review its KPI framework under the manifesto?

    Principle 10 asks institutions to “scrutinise indicators regularly and update them”, but sets no fixed interval. Good institutional practice, reflected in library and research-office guidance built on the manifesto, is an annual technical review of data sources plus a full policy review on the same three-to-five-year cycle as promotion-criteria revisions.

    The Leiden Manifesto’s ten principles were written as durable evaluation ethics, not a one-time compliance exercise. As institutions layer AI-assisted analytics, altmetrics and funder-mandated open-data reporting onto existing KPI frameworks, the manifesto’s core requirement — that quantitative evaluation support, not replace, expert judgement — becomes harder to satisfy by default and more important to audit deliberately. Research offices that build the checklist above into their annual promotion-criteria review cycle, rather than treating the manifesto as background reading, are the ones actually applying it.

  • Limitations of Bibliometrics: DORA and CoARA

    Bibliometrics — the statistical analysis of publication and citation data — cannot reliably stand in for research quality on its own: field-specific citation practices, author self-citation, and outright metric gaming all distort single-number scores such as the h-index or Journal Impact Factor. This is the documented evidentiary basis for DORA and CoARA’s push to replace single-score evaluation with qualitative, multi-indicator assessment.

    Bibliometrics is the quantitative study of academic literature — citation counts, publication volume, and derived indices — used as a proxy for scholarly influence. The proxy breaks down whenever a single number is asked to carry the full weight of a quality judgement, which is precisely what large-scale hiring, promotion, tenure, and funding panels have done for decades.

    What is bibliometrics, and why does one score fall short?

    Bibliometric indicators — citation counts, the h-index, the Journal Impact Factor (JIF), and derived composite scores — were built for large-scale, aggregate comparisons, not for judging an individual scholar’s contribution. Bergstrom, West and Wiseman’s 2008 analysis in the Journal of Neuroscience put it plainly: quantitative metrics are poor choices for assessing an individual’s research output compared with the “gold standard” of reading the work and consulting domain experts.

    A single score compresses conflicting dimensions of scholarly value — novelty, rigour, reproducibility, societal reach — into one figure. That compression, not citation data itself, is the structural weakness reform movements target.

    How does field bias distort bibliometric comparisons?

    Citation practices vary sharply by discipline, so raw citation counts cannot be compared across fields. Mathematics and the humanities publish and cite far less frequently than biomedicine, and books and conference proceedings — the dominant outputs in many humanities and computing sub-fields — are tracked inconsistently, or not at all, by Web of Science and Scopus.

    Coverage gaps compound the bias. Indexing databases differ in subject breadth, subject depth, geographic coverage, language coverage, and how far back citation histories extend, so researchers publishing outside the Anglophone, journal-dominant core of a database are systematically under-counted. Belter’s 2015 review, available via PubMed Central (PMC), also notes that citation-based indicators need roughly two to three years after publication to stabilise enough to be considered reliable, a lag that disadvantages early-career researchers and recent work.

    Why does self-citation inflate bibliometric scores?

    Self-citation — an author citing their own prior work — is a normal and often legitimate part of building on a research programme. It becomes a distortion when it is used strategically to inflate an individual’s citation count or a journal’s Impact Factor beyond what independent uptake of the work would justify.

    Clarivate’s Journal Citation Reports has, in past cycles, suppressed the calculated Impact Factor of titles found to display anomalous citation behaviour, including excessive journal self-citation and coordinated “citation stacking” arrangements between journals — a documented, database-level enforcement action against exactly this failure mode. At author level, unusually concentrated self-citation rates are one of the diagnostic flags bibliometricians use when auditing whether a headline citation figure reflects genuine external uptake or engineered inflation.

    Does field-weighted citation impact solve the problem?

    Field-weighted citation impact (FWCI) is a normalised metric — used in tools such as Scopus/SciVal — that adjusts a publication’s citation count against the average for its subject field, publication year, and document type, so that a score of 1.0 represents “as expected” performance for that context. It is a genuine improvement on raw citation counts because it corrects for the field-bias problem described above.

    FWCI does not, however, correct for self-citation gaming or database coverage gaps, and it remains a single number: it shows how a paper performed against a benchmark, not whether the research was rigorous or original. Reform frameworks treat field normalisation as a refinement of bibliometrics, not a licence to keep using any single indicator as a proxy for quality.
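
    The normalisation itself is simple arithmetic. The sketch below assumes a small table of expected citation counts per field, year and document type; those baseline values and the `fwci` helper are hypothetical, and real baselines come from the citation database. Averaging per-paper ratios for a set of papers is one common convention, assumed here.

```python
from statistics import mean

# Hypothetical baselines: average citations for (field, year, document type).
# Real values would come from the citation database, not from this table.
EXPECTED = {
    ("mathematics", 2020, "article"): 4.0,
    ("cell biology", 2020, "article"): 25.0,
}

def fwci(citations, field, year, doc_type):
    """Citations received divided by the expected count for the same context."""
    return citations / EXPECTED[(field, year, doc_type)]

papers = [
    {"citations": 8, "field": "mathematics", "year": 2020, "doc_type": "article"},
    {"citations": 25, "field": "cell biology", "year": 2020, "doc_type": "article"},
]
scores = [fwci(p["citations"], p["field"], p["year"], p["doc_type"]) for p in papers]
print(scores)        # [2.0, 1.0]
print(mean(scores))  # 1.5
```

    Under these assumed baselines, the mathematics paper with 8 citations performs at twice its benchmark, while the cell biology paper with 25 citations is only “as expected”. Raw counts would have ranked them the other way round.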

    What evidence underlies DORA and CoARA’s reform case?

    The San Francisco Declaration on Research Assessment (DORA), launched in 2012, explicitly recommends against using the Journal Impact Factor as a surrogate measure of the quality of individual research articles, and calls on institutions to assess research on its own merits using a range of qualitative and quantitative indicators. The Coalition for Advancing Research Assessment (CoARA), formed in 2022, builds on DORA’s diagnosis: its signatories commit to basing assessment primarily on qualitative, peer-reviewed judgement, supported by responsible — not exclusive — use of quantitative indicators, and to abandoning inappropriate use of journal- and publication-based metrics such as the JIF and h-index.

    Both build directly on the failure modes above: field bias, self-citation gaming, database coverage gaps, and the two-to-three-year reliability lag are the documented evidence, not abstract principle, behind the push for reform.

    Initiative | Launched | Core commitment
    DORA (San Francisco Declaration on Research Assessment) | 2012 | Stop using the Journal Impact Factor as a proxy for individual article or researcher quality
    Leiden Manifesto | 2015 (Hicks et al., Nature 520, 429–431) | Ten principles for the responsible, transparent use of quantitative indicators alongside expert judgement
    CoARA (Coalition for Advancing Research Assessment) | 2022 | Base assessment primarily on qualitative peer review; abandon inappropriate JIF/h-index use in hiring, promotion and funding decisions

    Answer-first questions on bibliometric limitations

    What are the main limitations of bibliometrics in research assessment?

    The main limitations are field bias (citation norms differ by discipline), database coverage gaps (books, non-English and non-journal outputs are under-tracked), self-citation inflation, and a two-to-three-year lag before citation counts stabilise. Together these mean a single score cannot substitute for expert, qualitative judgement of research quality.

    Why is the h-index considered a poor measure of individual research quality?

    The h-index rewards volume and career length over insight, cannot distinguish a highly cited author from a member of a large collaborative team, and does not account for field-specific citation norms. Bergstrom, West and Wiseman (2008) concluded that reading the work and consulting experts remains the more reliable standard for individual evaluation.

    What is the difference between DORA and CoARA?

    DORA (2012) is a signable declaration focused primarily on eliminating Journal Impact Factor misuse. CoARA (2022) is a membership coalition of funders, universities and academies that goes further, committing signatories to a broader, peer-review-centred reform agenda across hiring, promotion, and institutional evaluation, with periodic reporting on progress.

    What is a self-citation rate and why does it matter?

    A self-citation rate is the proportion of an author’s or journal’s total citations that come from their own prior work rather than independent external uptake. Bibliometricians and citation-database auditors (including Clarivate’s Journal Citation Reports process) use unusually high self-citation rates as a flag for possible metric gaming rather than genuine scholarly influence.
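
    As a rough illustration of that definition, the sketch below counts a citation as a self-citation when the cited author also appears among the citing paper’s authors. That matching rule, the `self_citation_rate` helper and the sample names are assumptions; databases differ on whether co-author citations count.

```python
def self_citation_rate(citing_author_lists, author):
    """Share of citations whose citing paper lists `author` as an author."""
    if not citing_author_lists:
        return 0.0
    self_cites = sum(1 for authors in citing_author_lists if author in authors)
    return self_cites / len(citing_author_lists)

# Each entry is the author set of one paper that cites the author's work.
citing = [
    {"A. Researcher", "B. Colleague"},
    {"C. Independent"},
    {"A. Researcher"},
    {"D. Other", "E. Other"},
    {"F. Other"},
]
rate = self_citation_rate(citing, "A. Researcher")
print(f"{rate:.0%}")  # 40%
```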

    What should research administrators do differently?

    For research administrators and institutional leaders, the practical implication is not to discard citation data but to stop letting any single figure carry a hiring, promotion, or funding decision unsupervised. That means:

    • Pairing field-normalised indicators such as FWCI with narrative, qualitative peer assessment, as CoARA commitments require.
    • Auditing self-citation and journal self-citation patterns before citing a headline figure in a case file.
    • Recognising a fuller range of outputs — datasets, software, policy influence — rather than journal articles alone.
    • Crediting individual contributions on multi-author papers explicitly, rather than inferring credit from author position or aggregate citation share.

    On that last point, standardised contributor-role taxonomies address a related gap directly. CASRAI originated the CRediT contributor role taxonomy in 2014; the standard is now stewarded by NISO as ANSI/NISO Z39.104-2022, and it lets institutions record which named contributor performed which specific role on a paper — conceptualisation, data curation, writing — rather than relying on citation share or author-list position as a proxy for who did what.

    Where bibliometric reform goes next

    The evidentiary case against single-number bibliometric scores is now well established: field bias, database coverage gaps, self-citation gaming, and a multi-year reliability lag are documented, auditable failure modes, not theoretical objections. DORA and CoARA translate that evidence into institutional commitments, and field-normalised metrics such as FWCI narrow — without eliminating — the field-bias problem.

    The direction of travel for funders, universities and academies is toward layered assessment: responsibly used quantitative indicators, transparent contributor-role attribution, and peer judgement at the centre, rather than any one score standing alone.

  • Is Self-Citation Ethical in Responsible Metrics?

    Is self-citation ethical? Self-citation is ethical when an author cites their own prior work because it is genuinely relevant to a new argument, method, or dataset; it becomes unethical only when the primary motive shifts to inflating citation counts, h-index, or a journal’s impact factor. Neither DORA nor CoARA — the two dominant responsible-metrics frameworks — sets a self-citation rule, leaving this judgement almost entirely to editors, reviewers, and individual conscience.

    Self-citation is the practice of an author referencing their own previously published work within a new publication, most commonly to establish methodological continuity, avoid self-plagiarism, or trace the development of a research programme over time.

    What counts as self-citation, and why do researchers do it?

    Self-citation occurs whenever an author lists their own prior publication in a new paper’s reference list. It is neither rare nor inherently suspect: most research is cumulative, and a study that builds on a researcher’s earlier method, dataset, or theoretical framework has good reason to cite that earlier work directly.

    • Establishing methodological continuity with a previously validated technique or instrument
    • Avoiding self-plagiarism by properly attributing earlier text, data, or ideas
    • Tracing the trajectory of a multi-paper research programme for the reader
    • Providing background the author is best placed to cite because they generated the original finding

    The Committee on Publication Ethics (COPE) has noted that failing to cite one’s own directly relevant prior work can itself mislead readers into thinking a study is more novel than it is — so the ethical failure mode runs in both directions, not only toward over-citation.

    How much self-citation is considered excessive?

    There is no single, universally agreed ceiling on self-citation rates. A 2023 analysis available via PubMed Central (PMC) concluded that a self-citation rate of around 20 percent is conservatively tolerable for individual researchers, with rates substantially above that treated as inappropriate. The same paper stresses, however, that discipline size and publication norms shift what counts as normal.

    COPE’s own November 2017 forum discussion, “Self-Citation: Where’s the Line?”, found no consensus figure among editors. Some journals cap the absolute number of self-citations (for example, no more than five), others use a percentage-of-total-references ceiling, and many rely on case-by-case editorial judgement rather than a fixed rule. COPE’s broader position on handling citation manipulation asks journals to set their own thresholds and educate authors, rather than prescribing one number for the whole of scholarly publishing.

    A 2025 analysis in the Journal of Academic Ethics (Springer) reinforces the intent-based test over a rate-based one, concluding that “ethical reviewers should avoid unnecessary self-citation” while allowing that citing one’s own work is acceptable “if directly relevant” — the same relevance-over-frequency logic COPE applies.

    Why don’t DORA and CoARA address self-citation directly?

    The San Francisco Declaration on Research Assessment (DORA, 2012) is aimed squarely at eliminating the use of the journal impact factor as a proxy for individual researcher quality in hiring, funding, and promotion decisions. It says nothing about how many times an author may cite themselves within a paper’s reference list — that is a citation-practice question, not a journal-metric question, and sits outside DORA’s original scope.

    The Coalition for Advancing Research Assessment (CoARA), formed in 2022, commits signatory institutions to move away from inappropriate use of quantitative indicators and toward qualitative, narrative-based evaluation. This is the closest thing academia has to a responsible-metrics consensus position, yet CoARA’s Agreement likewise does not name self-citation as a distinct risk category — it addresses metric misuse at the institutional and assessment level, not individual reference-list behaviour.

    The result is a genuine governance gap. Self-citation sits between two policy domains — publication ethics (COPE’s territory) and research assessment reform (DORA and CoARA’s territory) — without either treating it as a first-class concern. Editors are left applying inconsistent journal-level rules, while institutional assessment reformers focus almost entirely on how metrics are used rather than on what feeds into them.

    Disclosure norms vs blanket caps: the better governance model

    A blanket percentage cap on self-citation is easy to state but poorly matched to how research actually varies. Small or emerging subfields with few active authors, first-in-series methodology papers, and long-running research programmes will all show naturally higher self-citation rates than a large, well-established field — penalising a rate rather than the intent behind it risks punishing legitimate continuity while doing little to stop a determined metric-gamer, who can simply keep self-citations just under whatever line is drawn.

    A more workable precedent already exists in bibliometrics. The standardized citation-metrics database maintained by Ioannidis, Boyack, and Baas — used to identify the world’s most-cited scientists across disciplines — reports each author’s composite citation score both with and without self-citations included, alongside their raw self-citation percentage. It does not impose a cutoff; it makes the number visible and lets the reader judge. That is a disclosure model, not a cap.

    Framework | Year | Position on self-citation | Governance model
    COPE | 2017/ongoing | Case-by-case editorial judgement; no fixed universal threshold | Journal-level policy, editorial discretion
    DORA | 2012 | Not addressed; targets impact-factor misuse in assessment | Institutional assessment reform
    CoARA | 2022 | Not addressed; targets inappropriate metric use generally | Institutional assessment reform
    Ioannidis/Boyack/Baas database | 2019, updated annually | Reports self-citation rate transparently alongside adjusted score | Disclosure, no cap
    Individual journal caps | Varies | Fixed number or percentage limit on self-citations | Blunt rule, inconsistently applied

    Applying that same logic to individual authors and grant applicants is straightforward: require a disclosed self-citation rate alongside any citation-based metric submitted for hiring, promotion, or funding decisions, rather than an arbitrary cap that cannot distinguish a legitimate methods lineage from deliberate metric inflation.
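
    A disclosure record of this kind could look like the sketch below, which reports citation totals and the h-index both with and without self-citations. The field names, the `disclosure` helper and the sample figures are hypothetical; this is not the method used by any particular database.

```python
def h_index(counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(counts, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

def disclosure(papers):
    """papers: list of (total_citations, self_citations) per publication."""
    total = sum(t for t, _ in papers)
    self_total = sum(s for _, s in papers)
    return {
        "citations": total,
        "citations_excl_self": total - self_total,
        "self_citation_pct": round(100 * self_total / total, 1) if total else 0.0,
        "h_index": h_index([t for t, _ in papers]),
        "h_index_excl_self": h_index([t - s for t, s in papers]),
    }

# Hypothetical publication record: (total citations, of which self-citations).
papers = [(30, 5), (12, 6), (8, 4), (5, 3), (4, 2)]
report = disclosure(papers)
print(report)
```

    With these sample figures the h-index falls from 4 to 3 once self-citations are removed and the self-citation share is 33.9 percent. The record applies no cutoff; it leaves the judgement about intent to the reader.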

    Answer-first Q&A on self-citation ethics

    Is self-citation unethical?

    Self-citation is not inherently unethical. It becomes ethically problematic only when it is used to inflate citation metrics rather than to serve genuine scholarly continuity — what COPE treats as a form of citation manipulation. Relevance to the argument, not frequency, is the ethical test that matters.

    Is it okay to cite yourself in a research paper?

    Yes. Citing your own prior work is standard practice when it establishes methodological continuity, avoids self-plagiarism, or shows how a study builds on earlier findings. Problems arise only when self-citations serve no argumentative purpose beyond raising an author’s h-index or a journal’s impact factor.

    Is self-citation illegal?

    No. Self-citation is a matter of publication ethics, not law. Excessive or irrelevant self-citation can breach a journal’s editorial policy or COPE’s citation-manipulation guidance, potentially triggering a correction or editorial inquiry, but it is not in itself a legal offence.

    Implications for journals, funders, and institutions

    Journals can adopt the disclosure model directly: require authors to report a manuscript’s self-citation percentage at submission, alongside a one-line rationale where the rate is unusually high, rather than enforcing an arbitrary cap during peer review.

    CoARA signatories reforming promotion and funding criteria are well placed to extend their existing move toward narrative CVs by asking applicants to disclose self-citation-adjusted metrics alongside any citation count submitted for assessment — consistent with CoARA’s broader commitment to context over raw indicators.

    DORA signatories evaluating individual researchers already commit to judging research on its own merits rather than by journal-level proxies; adding a self-citation disclosure line to that practice would close a gap the original 2012 declaration was never designed to cover.

    Conclusion: toward transparent, not punitive, norms

    Self-citation is not a solved problem in responsible metrics guidance — it is an unaddressed one. DORA targets journal-level metric misuse; CoARA targets institutional assessment culture; COPE offers editorial case law without a universal rule. None of the three treats individual self-citation disclosure as a named requirement.

    The fix does not need a new blanket percentage cap, which would misfire across disciplines of different sizes and publication norms. It needs a disclosure norm: report the self-citation rate, report the rationale where it is high, and let editors, funders, and hiring committees judge intent with that information in hand — the same logic that already underpins the field’s most credible standardized citation databases.

  • SciVal Bibliometrics vs the Leiden Ranking: Benchmarking Under DORA

    SciVal is Elsevier’s Scopus-based platform for benchmarking research output; the CWTS Leiden Ranking is Leiden University’s field-normalised ranking that deliberately avoids one composite score. Institutions increasingly run both together, but DORA warns that any league-table framing can reduce research quality to a single misleading number.

    SciVal bibliometrics refers to the citation and output metrics — including Field-Weighted Citation Impact (FWCI) — that Elsevier’s SciVal platform generates from Scopus data to support institutional research evaluation. Research offices now routinely pair this proprietary layer with the CWTS Leiden Ranking’s open, transparent indicators, creating a benchmarking workflow that sits in direct tension with the San Francisco Declaration on Research Assessment (DORA).

    What is SciVal and what does it measure?

    SciVal is Elsevier’s research-analytics platform, built on Scopus abstract-and-citation data, that lets subscriber institutions benchmark output, impact, and collaboration against named peer groups. It does not produce publicly indexed rankings; access is by institutional subscription, and outputs are configured per user for internal decision-making rather than public comparison.

    Core SciVal modules include:

    • Overview — publication and citation summaries for an entity over time
    • Benchmarking — side-by-side comparison against selected competitor or aspirational institutions
    • Collaboration — network maps of co-authorship at institutional, national, and international level
    • Trends — topic-level growth signals used for strategic investment decisions

    Its signature indicator is Field-Weighted Citation Impact (FWCI), the ratio of citations a set of publications actually received to the citations expected for publications of the same type, year, and subject field. An FWCI of 1.0 represents the world average for that field; values above 1.0 indicate above-average citation impact.

    How does the CWTS Leiden Ranking differ from SciVal?

    The CWTS Leiden Ranking, produced annually since 2007 by the Centre for Science and Technology Studies at Leiden University, is a free, publicly available ranking that explicitly refuses to combine indicators into one overall score. Instead it publishes separate, field-normalised tables — including MNCS (mean normalised citation score) and PP(top 10%), the proportion of an institution’s output among the world’s most-cited 10% of papers in its field.
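
    The two indicators named above can be sketched as follows. The per-field reference values are hypothetical, and the sketch ignores refinements such as fractional counting or the handling of papers tied at the top-10% threshold, so it illustrates the idea rather than the ranking’s actual procedure.

```python
from statistics import mean

# Hypothetical per-field reference values for one publication year:
# (world mean citations, citation count needed to reach the top 10%).
REFERENCE = {
    "mathematics": (4.0, 10),
    "cell biology": (25.0, 60),
}

def mncs(papers):
    """Mean of each paper's citations divided by its field's world mean."""
    return mean(c / REFERENCE[f][0] for f, c in papers)

def pp_top10(papers):
    """Share of papers at or above their field's top-10% citation threshold."""
    return sum(1 for f, c in papers if c >= REFERENCE[f][1]) / len(papers)

# One institution's output as (field, citations) pairs.
papers = [
    ("mathematics", 12),
    ("mathematics", 2),
    ("cell biology", 50),
    ("cell biology", 75),
]
print(mncs(papers))      # 2.125
print(pp_top10(papers))  # 0.5
```

    Keeping the two figures separate, as the ranking does, shows that an institution can have a high average driven by a few papers while only half of its output reaches the top decile.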

    Where SciVal is a private diagnostic tool tuned to whatever comparator group an institution chooses, the Leiden Ranking is a public, methodologically documented instrument built for cross-institutional transparency. The distinction matters for governance: SciVal data informs internal strategy conversations, while Leiden Ranking data is citable externally by journalists, funders, and prospective students.

    | Dimension | SciVal | CWTS Leiden Ranking |
    | --- | --- | --- |
    | Underlying data source | Scopus | Web of Science (Classic edition) or OpenAlex (Open Edition) |
    | Access model | Institutional subscription | Free and publicly browsable |
    | Composite score | Configurable dashboards, no single mandated score | Explicitly none; indicators kept separate by design |
    | Level of analysis | Author, department, institution, custom groups | Institution-level only |
    | Signature indicator | Field-Weighted Citation Impact (FWCI) | MNCS and PP(top 10%) |
    | Governing body | Elsevier (commercial) | CWTS, Leiden University (academic) |

    Why does DORA caution against benchmarking with league tables?

    DORA, the San Francisco Declaration on Research Assessment published in 2012, calls on institutions to stop using journal- and rank-based proxies as substitutes for assessing the actual content of research. Its core recommendation is definitive: evaluators must not treat a journal impact factor, or by extension a university’s league-table position, as a surrogate measure of the quality of an individual researcher’s contribution.

    The UK’s Research Excellence Framework reinforces the same principle domestically: REF guidance instructs assessment panels not to rely on journal impact factors or bibliometric rankings when judging individual outputs. By the same logic, a single Leiden Ranking position or SciVal FWCI score compresses genuinely multidimensional research performance into one figure that is easy to misuse in hiring, promotion, and funding decisions.

    How are research offices combining SciVal and Leiden in practice?

    A DORA-conscious workflow uses SciVal for granular internal diagnostics and the Leiden Ranking for transparent, external context — never letting either stand alone as a judgement on individual quality. In practice this looks like a two-stage process rather than a single dashboard export.

    1. Research offices first use SciVal to identify departmental strengths, emerging topics, and collaboration gaps against a self-selected comparator set.
    2. They then check institutional standing against the Leiden Ranking’s published, field-normalised indicators to see how that internal picture holds up against an independently governed, public dataset.
    3. Neither output is applied directly to an individual researcher’s promotion or tenure case, consistent with DORA’s requirement that assessment be based on the substance of the work.

    This “basket of metrics” approach — pairing a proprietary analytics tool with an open, non-composite ranking — is increasingly the model that DORA-signatory universities describe in their own research-assessment policies.

    What does the OpenAlex-based Leiden Ranking Open Edition change?

    Since 2024, CWTS has published a Leiden Ranking Open Edition built entirely on OpenAlex data, run alongside the long-standing Web of Science-based Classic edition. OpenAlex, launched by OurResearch in 2022 as a free successor to the discontinued Microsoft Academic Graph, indexes a broader and more open set of scholarly outputs than either Scopus or Web of Science.

    Because the Open Edition and Classic edition draw on different underlying databases, the same institution can show a materially different position depending on which edition is consulted — a fact rarely mentioned in library guidance on SciVal or Leiden alone. This is itself a practical argument for DORA’s caution: even among ostensibly objective, field-normalised rankings, the choice of data source alone can shift an institution’s apparent standing, before any interpretive judgement is applied.

    Common questions about SciVal bibliometrics

    Is SciVal the same as Scopus?

    No. Scopus is Elsevier’s underlying abstract-and-citation database; SciVal is a separate analytics layer built on top of Scopus data. Scopus supplies the raw publication and citation records, while SciVal turns them into benchmarking dashboards, Field-Weighted Citation Impact scores, collaboration maps, and trend reports for institutions and funders.

    What is SciVal used for?

    Research offices use SciVal to benchmark departments against named peers, track Field-Weighted Citation Impact and output trends, identify emerging research strengths, map collaboration networks, and build evidence for grant applications — functions distinct from external, public rankings such as the Leiden Ranking.

    What are the limitations of SciVal?

    SciVal’s field-normalisation depends on how Scopus classifies each publication’s subject field, which can misclassify interdisciplinary work. Coverage is limited to Scopus-indexed output, under-representing books and some social-science and humanities journals. That gap illustrates why DORA warns against treating any single metric as definitive.

    What metrics does SciVal provide?

    Core SciVal indicators include Scholarly Output, Citation Count, Field-Weighted Citation Impact (world average equals 1.0), Outputs in Top Citation Percentiles, and Collaboration metrics. These sit alongside Leiden-style indicators such as MNCS and PP(top 10%) used for external, field-normalised comparison.

    What this means for research administrators

    For research administration teams, the practical guidance is to treat SciVal and the Leiden Ranking as complementary diagnostic inputs, not verdicts. Any institutional report that cites either should disclose the comparator group, data source (Scopus, Web of Science, or OpenAlex), and the field-normalisation method applied, so that governance committees can judge the figures in context rather than as a rank alone.
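
    One way to make that disclosure habitual is to refuse to store a figure without its context. The sketch below is illustrative only: the class name, field names and caption format are hypothetical, not part of SciVal or the Leiden Ranking.

```python
from dataclasses import dataclass

# The three data sources named in the guidance above.
ALLOWED_SOURCES = {"Scopus", "Web of Science", "OpenAlex"}


@dataclass(frozen=True)
class MetricDisclosure:
    indicator: str
    value: float
    data_source: str
    comparator_group: str
    normalisation: str

    def __post_init__(self):
        # Reject figures that arrive without the context a committee needs.
        if self.data_source not in ALLOWED_SOURCES:
            raise ValueError(f"unknown data source: {self.data_source!r}")
        for name in ("comparator_group", "normalisation"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must be disclosed")

    def caption(self):
        """One-line caption to print beside the figure in a report."""
        return (
            f"{self.indicator} = {self.value} (source: {self.data_source}; "
            f"comparators: {self.comparator_group}; normalisation: {self.normalisation})"
        )
```

    A report generator built this way cannot print a bare rank, because constructing the record fails if the comparator group or normalisation method is missing.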

    Where SciVal or Leiden data feeds into funding, hiring, or strategic planning, DORA-aligned institutions pair the quantitative output with qualitative peer assessment — a practice increasingly documented in the research-assessment policies of DORA-signatory universities.

    Where institutional benchmarking is heading

    As open bibliographic sources such as OpenAlex mature alongside proprietary platforms, expect research offices to triangulate across multiple data sources rather than anchor decisions to one dashboard or one ranking position. The direction of travel — visible in the Leiden Ranking’s own move to publish a parallel OpenAlex edition — is toward more transparent, multi-source benchmarking, precisely the “basket of metrics” model DORA has argued for since 2012.

    Research offices that document their methodology and keep SciVal, Leiden, and open datasets in dialogue with each other will be better placed to withstand scrutiny than those relying on any single proprietary score.

  • OpenAlex: The Case for Open Research Metrics

    OpenAlex is a free, CC0-licensed index of more than 319 million scholarly works, together with their authors and institutions, built by the non-profit OurResearch to replace the discontinued Microsoft Academic Graph. For institutions weighing research-metrics platforms, its open data answers a question closed commercial indices cannot: who can audit the numbers behind an assessment decision.

    OpenAlex is an openly accessible bibliographic catalogue of scientific papers, authors and institutions, named after the Library of Alexandria. One design choice, publishing the full dataset under a public-domain licence rather than behind a subscription wall, is what separates it structurally from Elsevier’s Scopus and Clarivate’s Web of Science, and why it has become a reference point in debates about research-assessment transparency.

    What Is OpenAlex?

    OpenAlex launched in January 2022, built by OurResearch (a US non-profit operating as Impactstory, Inc.) as a successor to the Microsoft Academic Graph, which Microsoft stopped updating on 31 December 2021. The project inherited MAG’s dataset and rebuilt it as an open, queryable graph of works, authors, institutions, funders, and topics.

    Two design decisions define the platform. First, the entire dataset is released under a Creative Commons Zero (CC0) licence, meaning any institution, developer, or researcher can download, redistribute, and build on it without permission or cost. Second, OpenAlex has formally adopted the Principles of Open Scholarly Infrastructure (POSI), a governance commitment covering sustainability, community control, and data portability.

    The scale is now substantial. OpenAlex’s own catalogue reports more than 319 million scholarly works, and its API handled roughly 115 million queries a month in 2024, according to figures cited in the platform’s Wikipedia entry. It draws source data from Crossref, ORCID, DOAJ, and Unpaywall rather than from a closed editorial pipeline.

    How Does OpenAlex Compare with Scopus and Web of Science?

    The practical difference is not just price — it is what each platform lets an institution verify. Scopus and Web of Science apply proprietary, selective journal-inclusion criteria and sell access to the resulting index. OpenAlex indexes broadly by default and publishes the inclusion logic as open code, which means an institution can inspect exactly why a work is or is not counted.

    | Dimension | OpenAlex | Scopus (Elsevier) | Web of Science (Clarivate) |
    | --- | --- | --- | --- |
    | Governance | Non-profit (OurResearch), POSI-aligned | Commercial publisher | Commercial data company |
    | Data licence | CC0, fully open, bulk download | Proprietary, licensed access only | Proprietary, licensed access only |
    | Core journal metric | No proprietary journal metric | CiteScore (four-year citation average) | Journal Impact Factor |
    | Coverage approach | Broad, automated aggregation, strong Diamond OA and non-English coverage | Curated, selective journal list | Curated, selective journal list |
    | Cost to institutions | Free API; optional paid support tier | Subscription | Subscription |

    CiteScore, Scopus’s flagship journal metric, averages the citations a journal’s documents receive over a four-year window — a useful signal, but one calculated entirely inside a closed system that institutions cannot independently reproduce. OpenAlex does not publish an equivalent branded journal score; instead it exposes the underlying citation and work-level data so that any bibliometrician can calculate their own indicator and show their working.
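
    The arithmetic behind a four-year average of this kind can be sketched as below. The notation is introduced here for illustration, and the sketch assumes the same four-year window is used both for the documents counted and for the citations they receive: C is the number of citations received in years Y-3 to Y by documents published in those years, and N is the number of those documents.

```latex
\mathrm{CiteScore}_{Y} = \frac{C_{[Y-3,\,Y]}}{N_{[Y-3,\,Y]}}
```

    For example, a journal whose 500 documents from the window drew 2,000 citations in the same window would score 2,000 / 500 = 4.0. The point made above still holds: an institution can write the formula down, but cannot recompute C or N without access to the licensed index.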

    Coverage differences matter for equity as much as accuracy. A 2024 study cited in OpenAlex’s Wikipedia entry found the platform indexes more than 12,500 Diamond Open Access journal titles, including over 60% of Diamond OA journals absent from both Web of Science and Scopus — a direct consequence of not gating inclusion behind a commercial selection committee.

    Why Does Open Metrics Infrastructure Serve DORA’s Transparency Principle?

    The San Francisco Declaration on Research Assessment (DORA), first published in 2012, asks funders, institutions, and publishers to stop substituting journal-based proxies for direct evaluation of research and to be explicit about the criteria used in funding, hiring, and promotion decisions. That explicitness requirement is where the platform choice stops being neutral.

    A closed index can tell an institution that a number was calculated a certain way, but it cannot let that institution independently verify how, because the underlying citation graph is licensed, not published. An open metadata layer removes that opacity: the same dataset an institution cites in a tenure file or a funding report can be downloaded, re-run, and checked by anyone, including the researcher being assessed.

    Adoption evidence has followed the argument. Leiden University announced in September 2023 that it would produce an open-source edition of its CWTS Leiden Ranking using OpenAlex data from 2024 onward. Sorbonne University announced in December 2023 that it was discontinuing its Web of Science subscription in favour of OpenAlex. In 2024, France’s Ministry of Higher Education and Research pledged financial support to the project, describing it as “crucial open science infrastructure,” and the Arcadia Fund awarded OurResearch a $7.5 million grant explicitly to build OpenAlex into a sustainable alternative to commercial citation indices.

    • Leiden University: open-source CWTS Leiden Ranking edition built on OpenAlex data (from 2024)
    • Sorbonne University: Web of Science subscription discontinued in favour of OpenAlex (December 2023)
    • French Ministry of Higher Education and Research: financial commitment to OpenAlex as open science infrastructure (2024)
    • Arcadia Fund: $7.5 million grant to OurResearch for OpenAlex sustainability (March 2024)

    None of this means closed indices lack value; their curated selection and mature analytics tooling still suit some high-stakes evaluations. But where the explicit requirement is transparency rather than convenience, an auditable, CC0-licensed data layer meets DORA’s stated principle more directly than a licensed black box.

    Common Questions About OpenAlex

    What is OpenAlex used for?

    Universities, funders, and publishers use OpenAlex to track publication output, measure open-access status, benchmark institutional performance, and feed alternative rankings such as the open-source CWTS Leiden Ranking. Its free API also underpins third-party dashboards, systematic-review tools, and research-information systems that need citation and affiliation data without a subscription fee.

    Is OpenAlex legit?

    Yes. OpenAlex is maintained by OurResearch, a non-profit with a multi-year record of building open scholarly infrastructure, and it has formally adopted the Principles of Open Scholarly Infrastructure (POSI). Its data and methodology are openly licensed and auditable, and the platform is already cited in peer-reviewed scientometrics research, including a 2022 arXiv paper by its founders.

    Is OpenAlex free?

    Yes. The full dataset is released under a Creative Commons Zero (CC0) public-domain licence, and the REST API can be queried without a subscription, unlike Scopus or Web of Science. Rate limits apply to free use of the API, and OurResearch offers an optional paid support tier for high-volume institutional queries.

    Who owns OpenAlex?

    OpenAlex is created and maintained by OurResearch, a US-based non-profit operating as Impactstory, Inc., not by a commercial publisher. Governance sits with a mission-driven organisation rather than a shareholder-owned company — the structural distinction that underpins its CC0 licensing and its appeal to institutions pursuing publisher-independent, DORA-aligned metrics.

    What Should Institutional Leaders Do Next?

    Platform choice is now a governance decision, not just a procurement one. An institution that cites OpenAlex data in a promotion case, a funding report, or an open-access dashboard is making a transparency claim as well as a metrics claim, and that claim should be tested before it is relied upon.

    • Map which existing assessment workflows (tenure, funding reports, rankings submissions) rely on a metric an evaluator cannot independently reproduce.
    • Pilot OpenAlex alongside — not instead of — existing subscriptions, comparing coverage gaps directly against Scopus or Web of Science outputs for your own institutional corpus.
    • Document data provenance explicitly in assessment criteria, consistent with DORA’s requirement for stated, auditable methodology.
    • Track POSI-aligned infrastructure commitments (OpenAlex, Crossref, ORCID, ROR) as the durable layer beneath any commercial tool an institution also chooses to license.
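
    The pilot step above, comparing coverage for an institution's own corpus, can be as simple as a set comparison on DOIs. This is a minimal sketch assuming both sources can export a list of DOIs; the function names and the sample DOIs are hypothetical.

```python
def normalise_doi(doi):
    """Lower-case a DOI and strip common prefixes so the same work matches across sources."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def coverage_gaps(openalex_dois, commercial_dois):
    """Split two DOI lists into works found in both sources and works found in only one."""
    a = {normalise_doi(d) for d in openalex_dois}
    b = {normalise_doi(d) for d in commercial_dois}
    return {
        "both": sorted(a & b),
        "openalex_only": sorted(a - b),
        "commercial_only": sorted(b - a),
    }


# Invented DOIs standing in for two exports of the same institutional corpus.
openalex_export = ["https://doi.org/10.1234/A1", "10.1234/b2", "10.1234/c3"]
commercial_export = ["10.1234/a1", "doi:10.1234/D4"]
gaps = coverage_gaps(openalex_export, commercial_export)
print(gaps["openalex_only"])  # prints ['10.1234/b2', '10.1234/c3']
```

    Works without a DOI fall outside this comparison, so the result is best read as a lower bound on the real coverage difference.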

    Open, non-proprietary metadata will not replace every function a commercial index performs today. But as funders and assessment reformers keep pressing for auditable evidence over proprietary scores, institutions that already understand — and can reproduce — their own metrics will be the ones best placed to defend them.

  • OpenAlex API: Building a Metrics Dashboard

    The OpenAlex API is a free, fully open REST interface to a catalogue of hundreds of millions of scholarly works, authors, institutions and funders, and it is the most practical data source for building an in-house institutional research metrics dashboard without a subscription. Query the /works endpoint with an institution filter, aggregate with group_by, and you have publication counts, open-access share and citation-percentile data in a single JSON response.

    OpenAlex is an open, CC0-licensed catalogue of the global research system — works, authors, institutions, sources, funders and topics — built and maintained by the non-profit OurResearch as a successor to the discontinued Microsoft Academic Graph. Because every record and the API itself are free to query, research offices can build metrics dashboards without licensing a commercial bibliometrics platform, provided they understand the filter syntax, pagination limits and the metric gaps this guide covers.

    What is the OpenAlex API and what does it cover?

    The OpenAlex API exposes entity endpoints — Works, Authors, Institutions, Sources, Topics, Funders and Awards — each accessed at https://api.openalex.org/{entity}. Every entity supports four operations: list, get (by ID), filter, and group_by (server-side aggregation), which together are the building blocks of a dashboard.

    Each entity carries a persistent OpenAlex ID and, for institutions, a cross-walked ROR identifier — the Research Organization Registry ID also used by ORCID, Crossref and DataCite. Filtering on an institution’s ROR-linked OpenAlex ID, rather than a free-text name match, is what keeps a dashboard’s institutional attribution stable as an organisation’s name or subsidiary structure changes.

    | Entity endpoint | Dashboard use case | Example filter |
    | --- | --- | --- |
    | /works | Publication counts, open-access share, citation percentiles | authorships.institutions.id |
    | /authors | Researcher productivity, h-index-style summary stats | affiliations.institution.id |
    | /institutions | Peer benchmarking, collaboration networks | ror |
    | /topics | Subject-area concentration and trend detection | works_count |

    How do you query the Works endpoint for institutional metrics?

    Every institution-level query starts with the authorships.institutions.id filter set to the institution’s OpenAlex ID, which you resolve once via /institutions?filter=ror:https://ror.org/{your-ror-id}. From there, join separate filters with commas (AND logic), separate alternative values inside a single filter with pipes (OR logic), and add group_by to turn a list query into an aggregation query in one request, with no client-side loop required.

    • Publication trend: /works?filter=authorships.institutions.id:I123...,publication_year:2020-2026&group_by=publication_year
    • Open-access share: add &group_by=oa_status to the same filter to split output into gold, green, hybrid, bronze and closed counts.
    • Field distribution: &group_by=primary_topic.field.id reveals subject concentration across an institution’s output.
    • Collaboration mapping: &group_by=authorships.institutions.id returns co-publishing partner institutions ranked by shared-work count.
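
    The query patterns listed above can be assembled by a small helper rather than by hand-editing strings. This is a sketch, not an official client: the function name is hypothetical and I0000000000 is a placeholder, not a real institution ID.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org"


def works_url(institution_id, filters=None, group_by=None, select=None):
    """Build a /works URL scoped to one institution.

    filters maps a filter name to one value or a list of values. A list is
    joined with "|" (OR within that filter); separate filters are joined
    with "," (AND).
    """
    parts = [f"authorships.institutions.id:{institution_id}"]
    for name, value in (filters or {}).items():
        if isinstance(value, (list, tuple)):
            value = "|".join(str(v) for v in value)
        parts.append(f"{name}:{value}")
    params = {"filter": ",".join(parts)}
    if group_by:
        params["group_by"] = group_by
    if select:
        params["select"] = ",".join(select)
    # safe= leaves the filter punctuation readable instead of percent-encoding it
    return f"{BASE_URL}/works?{urlencode(params, safe=':,|')}"


# Placeholder institution ID; substitute the ID resolved from your ROR record.
url = works_url("I0000000000", {"publication_year": [2023, 2024]}, group_by="oa_status")
print(url)
```

    Building the filter from a dictionary keeps the institution scope in one place, so every chart on a dashboard is guaranteed to be counting the same set of works.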

    Use the select parameter to strip unused fields from large responses, and switch from offset-based page/per_page pagination to cursor pagination once a query’s meta.count exceeds 10,000 results: offset pagination is capped and cannot reach records beyond that depth.
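
    Cursor pagination can be sketched as a generator that keeps requesting pages until the API stops returning a cursor. The sketch assumes, per the OpenAlex documentation as understood here, that the first request passes cursor=*, that each response carries the next cursor in meta.next_cursor, and that the page-size parameter is spelled per-page. The function names are hypothetical, and the fetch argument exists so the loop can be exercised without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_json(url):
    """Default fetcher: one GET request parsed as JSON (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


def iter_works(filter_expr, per_page=200, fetch=fetch_json):
    """Yield every work matching filter_expr, following cursors page by page."""
    cursor = "*"  # assumed start value for a cursor-paged result set
    while cursor:
        params = {"filter": filter_expr, "per-page": per_page, "cursor": cursor}
        page = fetch("https://api.openalex.org/works?" + urlencode(params, safe=":,|*"))
        results = page.get("results", [])
        if not results:
            break
        yield from results
        # A missing or null next_cursor ends the loop.
        cursor = page.get("meta", {}).get("next_cursor")
```

    Because the function is a generator, a nightly job can stream records into a database as they arrive instead of holding the whole result set in memory.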

    How do you approximate field-weighted citation impact with OpenAlex data?

    Field-weighted citation impact (FWCI) is an indicator popularised by Elsevier’s SciVal and Scopus products, calculated by comparing a work’s citations to the average for same-year, same-subject, same-document-type publications. No open API replicates the Scopus subject-classification baseline that figure is normalised against, so any FWCI-style value taken from OpenAlex, including the fwci value now attached to its work records, rests on OpenAlex’s own corpus and classification and will not reproduce the SciVal number.

    A simpler open alternative is the cited_by_percentile_year object returned on work records, which gives a min/max percentile rank of that work’s citation count against all works of the same publication year and type. Aggregating this field across an institution’s output (for example, the share of works in the top decile, percentile ≥ 90, per year) produces a transparent, reproducible citation-impact proxy that a dashboard can compute without a commercial licence, though it is not interchangeable with SciVal’s FWCI for benchmarking against institutions that report the Scopus figure.
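
    The top-decile aggregation described above can be sketched as follows. It assumes each work record is a dictionary with publication_year and a cited_by_percentile_year object holding min and max, and it uses the min value as the more conservative of the two, which is a choice made for this illustration. The function name and sample records are hypothetical.

```python
from collections import defaultdict


def top_decile_share_by_year(works, threshold=90):
    """Share of works per publication year whose percentile floor is at or above threshold."""
    totals = defaultdict(int)
    top = defaultdict(int)
    for work in works:
        percentile = (work.get("cited_by_percentile_year") or {}).get("min")
        year = work.get("publication_year")
        if percentile is None or year is None:
            continue  # leave out records with no percentile rather than count them as zero
        totals[year] += 1
        if percentile >= threshold:
            top[year] += 1
    return {year: top[year] / totals[year] for year in sorted(totals)}


# Invented records shaped like the fields named above.
sample = [
    {"publication_year": 2022, "cited_by_percentile_year": {"min": 95, "max": 96}},
    {"publication_year": 2022, "cited_by_percentile_year": {"min": 40, "max": 45}},
    {"publication_year": 2023, "cited_by_percentile_year": {"min": 91, "max": 92}},
    {"publication_year": 2023, "cited_by_percentile_year": None},
]
print(top_decile_share_by_year(sample))  # prints {2022: 0.5, 2023: 1.0}
```

    Reporting how many records were skipped alongside the share is worth considering, since a year with few percentile-bearing works can produce a misleadingly high or low figure.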

    For most dashboards the honest approach is to present both: raw citation counts (context-dependent, not comparable across fields) and the percentile-year proxy (comparable within OpenAlex’s corpus), clearly labelled as distinct from any vendor-reported FWCI value cited in external reports.

    What are the authentication, rate-limit and pricing rules?

    OpenAlex’s underlying dataset, website and API are free and the data is CC0-licensed, so no purchase is required to query or redistribute results. Every request should still include a contact identifier — either a mailto query parameter with your email address or a registered api_key — to enter the “polite pool”, which OurResearch prioritises over anonymous traffic for faster, more consistent response times.

    Requests without a mailto parameter or API key are routed to a slower, lower-priority pool and are more likely to be throttled during peak load; this single parameter is the most common fix for intermittent 429 or timeout errors reported by developers building batch-harvesting scripts. Dashboard builders scheduling nightly refresh jobs should always set mailto or an API key rather than relying on the anonymous pool.
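
    Adding the contact parameter is easy to forget when URLs are built in several places, so one option is to apply it in a single helper just before a request is sent. This is a minimal sketch using only the standard library; the function name and the example address are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_contact(url, email):
    """Return url with a mailto parameter added, replacing any existing one."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "mailto"
    ]
    query.append(("mailto", email))
    # safe= keeps filter punctuation and the "@" readable in the final URL
    return urlunsplit(parts._replace(query=urlencode(query, safe=":,|@*")))


print(with_contact("https://api.openalex.org/works?group_by=oa_status", "metrics@example.org"))
# prints https://api.openalex.org/works?group_by=oa_status&mailto=metrics@example.org
```

    Using a shared team mailbox rather than a personal address keeps scheduled jobs contactable after staff changes.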

    Common developer questions

    Is the OpenAlex API free?

    Yes. OpenAlex is free to query, and the underlying data is licensed under CC0, meaning it can be reused and redistributed without royalties. Registering an email via the mailto parameter or an API key gives access to the faster “polite pool” but does not change the underlying no-cost model.

    Does OpenAlex have an API for institutional data?

    Yes. The Institutions endpoint returns disambiguated organisation records cross-walked to ROR identifiers, and the Works endpoint accepts an authorships.institutions.id filter, which is the standard way to scope any query to a single institution’s publication output for a dashboard.

    What is OpenAlex used for in research administration?

    Research offices use OpenAlex to track publication trends, open-access compliance, collaboration networks and topic concentration without paying for a commercial bibliometrics subscription. Its open licence also makes it suitable for public-facing institutional reporting, since results can be republished without redistribution restrictions.

    Implications for institutional research offices

    A dashboard built directly on the OpenAlex API gives research administration teams a free, auditable alternative to proprietary bibliometrics tools for routine reporting — publication counts, open-access compliance tracking and collaboration mapping — while reserving paid platforms for tasks that genuinely require vendor-normalised metrics such as reported FWCI. The trade-off is that teams take on the engineering work themselves: handling pagination beyond 10,000 results, keeping institution ID mappings current as ROR records change, and documenting clearly that a percentile-based proxy is not the same figure a funder or ranking body may expect from Scopus.

    As OpenAlex’s topic classification and percentile fields mature, the gap between what a free, transparent API can deliver and what a paid platform delivers continues to narrow for most day-to-day institutional reporting needs, making a well-built in-house dashboard an increasingly credible default rather than a stopgap.

  • What Is Bibliometrics? A Research Office Primer

    Bibliometrics is the quantitative analysis of scholarly publications and the citations between them, used to measure research output, impact and collaboration patterns. For a research office, the practical challenge is rarely gathering these numbers — library systems, funders and university dashboards supply them constantly — but recognising which of the three main types of bibliometrics a given report represents, and what it can and cannot responsibly tell you.

    In its simplest form, bibliometrics is the statistical analysis of books, articles and other publications, most often using citation counts to describe patterns in scholarly communication. That one-line definition, drawn from the OECD’s usage and echoed by university library guides, is the starting point for everything that follows.

    What is bibliometrics?

    Bibliometrics applies statistical methods to bibliographic data — publication counts, citation counts, co-authorship networks and, increasingly, download and mention data — to describe and evaluate scholarly activity. It sits alongside scientometrics, a closely related field that extends the same statistical logic to science and technology output more broadly; in practice research offices treat the two terms as near-synonyms.

    Eugene Garfield, founder of the Institute for Scientific Information and creator of the Science Citation Index in 1964, is widely credited as a founding figure of modern bibliometrics. His citation-indexing work established the infrastructure — later commercialised as Web of Science — that most present-day bibliometric reporting still depends on.

    A metrics report a research office receives is rarely a single “bibliometric score.” It is usually a blend of three distinct analytical modes, and conflating them is a common source of misread reports.

    What are the three types of bibliometrics?

    Library and information science distinguishes descriptive, evaluative and relational bibliometrics. Each answers a different question, and each carries a different risk of misinterpretation when applied outside its proper scope.

    | Type | Core question it answers | Typical output | Main risk if misread |
    | --- | --- | --- | --- |
    | Descriptive | How much has been published, by whom, where? | Publication counts, output by year, discipline or department | Treated as a quality signal when it only measures volume |
    | Evaluative | How much impact or influence has that output had? | Citation counts, h-index, Journal Impact Factor, Field-Weighted Citation Impact | Used to rank individuals directly, ignoring field and career-stage differences |
    | Relational | How are researchers, topics or institutions connected? | Co-authorship networks, co-citation maps, research-front clustering | Read as a measure of quality rather than of structure or collaboration |

    Descriptive bibliometrics is the safest category for research offices to report externally, because it counts rather than judges. Evaluative bibliometrics is the category most prone to misuse — a single h-index or Journal Impact Factor figure says nothing about an individual paper’s quality. Relational bibliometrics is the least familiar to non-specialists but the most useful for identifying emerging collaboration opportunities or research strengths across a department.

    What bibliometric indicators will appear in a metrics report?

    Most institutional metrics reports combine a handful of recurring indicators. Knowing which category each one belongs to prevents a descriptive count being read as an evaluative judgement.

    • Citation count: the raw number of times a work has been cited; evaluative, but highly field- and age-dependent.
    • h-index: an author-level figure meaning a researcher has h publications each cited at least h times; evaluative, and known to disadvantage early-career researchers and those in low-citation-rate fields.
    • Journal Impact Factor (JIF): the citations a journal receives in a given year to items it published in the two preceding years, divided by the number of citable items it published in those two years; a journal-level, not an article-level, indicator.
    • Field-Weighted Citation Impact (FWCI): a normalised indicator comparing a publication’s citations against the global average for its subject, document type and publication year; a value above 1 indicates above-average performance for that field.
    • Altmetrics: non-citation signals such as policy-document mentions, news coverage, social media activity and downloads, which supplement rather than replace citation-based evaluation.
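
    Of the indicators above, the Journal Impact Factor is the one whose arithmetic is most often misremembered, so a worked form may help. The notation is introduced here for illustration: the numerator counts citations made in year Y to items the journal published in the two preceding years, and the denominator counts the citable items it published in those two years.

```latex
\mathrm{JIF}_{Y} =
\frac{\text{citations in year } Y \text{ to items published in } Y-1 \text{ and } Y-2}
     {\text{citable items published in } Y-1 \text{ and } Y-2}
```

    For example, a journal that published 200 citable items across 2022 and 2023, and whose items from those years were cited 600 times during 2024, would have a 2024 JIF of 600 / 200 = 3.0. The figure describes the journal's average and says nothing about how any one of those 200 items performed.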

    These indicators are drawn from different underlying databases, and coverage varies. Web of Science and Scopus apply curated, subscription-based indexing; Google Scholar offers broad, free coverage with less curation; Dimensions links publications to grants and clinical trials on a freemium basis. A report’s headline number can shift depending on which source supplied it.

    How should research offices use bibliometrics responsibly?

    Bibliometrics should inform, not replace, expert judgement. Three widely referenced frameworks set out how research offices can operationalise that principle rather than treat it as an aspiration.

    The San Francisco Declaration on Research Assessment (DORA), launched in 2012, commits signatory institutions to avoid using journal-based metrics such as the Journal Impact Factor in hiring, promotion or funding decisions. Imperial College London, for example, states it has applied this commitment since becoming a DORA signatory in 2017.

    The UK’s Metric Tide review, commissioned by the then Higher Education Funding Council for England (now part of UK Research and Innovation) and published in 2015, set out five principles for responsible metrics: robustness, humility, transparency, diversity and reflexivity. Those five principles remain the reference point most UK research offices cite when drafting internal metrics policies.

    INORMS’ Research Evaluation Working Group publishes the SCOPE framework — Start, Context, Options, Probe, Evaluate — a five-step method research administrators can apply before commissioning or interpreting any metrics report, rather than defaulting to whichever indicator a database happens to surface first.

    • Start by clarifying the purpose of the evaluation before selecting any indicator.
    • Establish the context: discipline, career stage, output type and comparator group.
    • Identify the options available, including qualitative alternatives such as peer review or narrative CVs.
    • Probe the suitability and limitations of each proposed indicator.
    • Evaluate the process itself once the assessment is complete, and refine it for next time.

    Momentum toward narrative-based assessment has also grown outside the UK: the 2022 Coalition for Advancing Research Assessment (CoARA), joined by many Horizon Europe-affiliated funders and institutions, commits signatories to reduce reliance on journal-based and output-count metrics in funding and hiring decisions.

    It is worth distinguishing bibliometrics from contributor-level attribution. Bibliometrics counts citations and outputs; it does not record who did what on a given paper. CASRAI originated the CRediT contributor role taxonomy in 2014 for that separate purpose, and the standard is now stewarded by NISO as ANSI/NISO Z39.104-2022. A research office reconciling a bibliometrics report with authorship disputes should reach for CRediT roles, not citation counts.

    Common questions about bibliometrics

    What is the meaning of bibliometric?

    “Bibliometric” describes any measure derived from the statistical analysis of published scholarly output — most commonly publication counts and citation counts. The term covers the underlying data point (a bibliometric) and the wider field that studies it (bibliometrics), and applies equally to authors, journals, institutions and individual articles.

    What is an example of bibliometrics?

    The h-index is the most commonly cited example: an author with an h-index of 20 has 20 publications that have each received at least 20 citations. Other everyday examples include a journal’s Impact Factor, a department’s annual publication count, and a co-authorship map showing collaboration between institutions.
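
    The definition is simple enough to compute by hand from a list of per-paper citation counts. The sketch below follows the definition given above; the function name and sample counts are illustrative.

```python
def h_index(citation_counts):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(ranked, start=1):
        if citations >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


print(h_index([25, 8, 5, 3, 3, 1]))  # prints 3
```

    In the sample, one paper with 25 citations does not lift the result above 3, which shows why the h-index understates researchers whose impact rests on a small number of highly cited works.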

    What is bibliometrics in simple terms?

    In simple terms, bibliometrics counts and analyses publications and citations to show how much research is being produced and how much attention it receives. It turns scattered publication records into structured evidence — useful for funding reports and CVs, but never a full substitute for reading the work itself.

    Who is the father of bibliometrics?

    Eugene Garfield (1925–2017) is widely regarded as the founding figure of bibliometrics and scientometrics. As founder of the Institute for Scientific Information, he created the Science Citation Index in 1964, establishing the citation-indexing infrastructure that underpins most bibliometric analysis conducted today.

    What this means for research offices

    A metrics report that blends descriptive, evaluative and relational bibliometrics without labelling which is which will inevitably be misread by whoever receives it next — a promotion panel, a funder, or a departmental head. Labelling each figure by type, and pairing evaluative indicators with the field-normalisation context they need, is a low-cost fix most research offices can apply immediately.

    As narrative-assessment frameworks such as CoARA and DORA gain signatories, research offices should expect bibliometric reports to sit alongside, not instead of, qualitative evidence in funding and promotion decisions. Building that dual capability now — clear metrics literacy plus a credible narrative-CV process — will matter more with each assessment cycle, not less.

  • PlumX vs Altmetrics: Compare Coverage Gaps

    PlumX Metrics and Altmetric both track online attention to research outputs, but they are not interchangeable: PlumX organises data into five uncombined categories (Citations, Usage, Captures, Mentions, Social Media), while Altmetric.com aggregates sources into a single weighted Altmetric Attention Score. Choosing between PlumX and Altmetric for assessment therefore depends on whether an institution needs a granular breakdown or a single comparable number, and on disclosing what each tool does not cover.

    Altmetrics, as a category, is the collective term for non-citation indicators of research attention — social media mentions, news coverage, policy citations, blog posts, and readership counts — used alongside, not instead of, traditional bibliometrics.

    What is PlumX Metrics?

    PlumX Metrics is an altmetrics service developed by Plum Analytics, now owned by Elsevier and embedded directly into Scopus and SciVal article records. It does not produce a single composite score. Instead, it sorts attention data into five discrete categories: Citations, Usage, Captures, Mentions, and Social Media, displayed as a five-segment “Plum Print” whose circle sizes scale with activity in each bucket.

    The University of Waterloo Library notes that PlumX “deliberately does not aggregate their altmetric data sources into a single score” — a design choice that keeps categories separable but makes cross-article ranking harder than with a single number.

    What are Altmetrics and the Altmetric Attention Score?

    Altmetric.com, part of Digital Science, is the best-known commercial provider of the broader “altmetrics” concept. It compresses attention data from news outlets, blogs, policy documents, X/Twitter, and other sources into a single weighted number — the Altmetric Attention Score — visualised as a multicoloured “donut” where each segment represents a source type.

    This single-score design makes Altmetric easier to sort and benchmark at scale across large publication sets, which is why publisher platforms including Wiley and Springer Nature embed the Altmetric badge directly on article pages.

    PlumX vs Altmetrics: data sources and categories compared

    A 2024 study in Learned Publishing by Rasuli directly tested coverage differences between the two tools and found neither is universally superior: Altmetric.com had the strongest coverage of blogs, news articles, and X/Twitter mentions, while PlumX showed better coverage of Mendeley reader counts. That finding is the clearest evidence that the two tools are complementary, not substitutable.

    Dimension | PlumX Metrics | Altmetric (Attention Score)
    Owner | Plum Analytics, part of Elsevier | Altmetric.com, part of Digital Science
    Category structure | Five uncombined categories: Citations, Usage, Captures, Mentions, Social Media | Single weighted score plus a source-level breakdown
    Composite score | None; a five-category “Plum Print” visual | Yes; the Altmetric Attention Score, one number
    Documented strength (Rasuli, 2024) | Mendeley reader counts | Blogs, news, and X/Twitter mentions
    Primary institutional integration | Scopus and SciVal article records | Altmetric Explorer; publisher platforms (Wiley, Springer Nature)

    Both tools also carry citation data, but they position it differently: PlumX’s Citations category explicitly folds in Scopus citation counts alongside patent, clinical, and policy citations, while Altmetric treats citation data as a separate, secondary layer rather than a core category.

    What coverage gaps should institutions disclose?

    Neither tool captures the full universe of research attention, and both have known blind spots that assessment reports should state explicitly rather than imply away:

    • Platform coverage is uneven. The Rasuli (2024) comparison shows each tool systematically under-represents sources the other captures better, so a low score on one platform does not mean low attention overall.
    • Absence of a score is not absence of impact. An article with no PlumX or Altmetric activity may simply lack a DOI-linked record, an institutional repository deposit, or public discussion in the tracked window — not lack of scholarly value.
    • Composite scores obscure source mix. A single Altmetric Attention Score can be driven almost entirely by one viral social post; disclosure should note the underlying source breakdown, not just the headline number.
    • Gaming and reproducibility risk. NISO’s Recommended Practice RP-25-2016, the output of its Alternative Assessment Metrics initiative, explicitly flags data-quality, persistent-identifier, and manipulation-resistance requirements that altmetrics providers and institutions using them should meet.
    • Metrics indicate attention, not quality. INORMS’s SCOPE framework for responsible research evaluation stresses that any metric — including altmetrics — should be interpreted only against the specific purpose it was chosen to serve, not treated as a proxy for research quality.

    Research administrators compiling assessment dossiers should state which tool was used, the date the score was pulled, and which categories were included — omitting this context is the most common disclosure failure institutions make when citing either platform.
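
    One way to make that disclosure routine is to store it with the figures themselves. The sketch below is illustrative only: the class name, field names and sample counts are hypothetical and do not correspond to any PlumX or Altmetric export format. It records the tool, the pull date and the categories, and reports which single category dominates the total.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AttentionDisclosure:
    """Context to report alongside any PlumX or Altmetric figure."""
    tool: str          # which platform the figures came from
    pulled_on: date    # date the figures were retrieved
    counts: dict       # category or source name -> raw count

    def dominant_source(self):
        """Return (name, share of total) for the busiest category."""
        total = sum(self.counts.values())
        if total == 0:
            return None, 0.0
        name = max(self.counts, key=self.counts.get)
        return name, self.counts[name] / total

    def statement(self):
        """One-line disclosure suitable for a dossier footnote."""
        categories = ", ".join(sorted(self.counts))
        return f"{self.tool}, pulled {self.pulled_on.isoformat()}, categories: {categories}"


# Hypothetical figures for a single article.
record = AttentionDisclosure(
    tool="PlumX",
    pulled_on=date(2025, 1, 15),
    counts={"Citations": 4, "Usage": 120, "Captures": 30, "Mentions": 1, "Social Media": 45},
)
print(record.statement())
print(record.dominant_source())  # ('Usage', 0.6)
```

    Reporting the dominant category next to any headline figure makes it visible when most of the activity comes from one source.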

    Answer-first Q&A

    Is Altmetric reliable?

    Altmetric is reliable as an indicator of online attention, not as a quality measure. Because it harvests data from many external, non-standardised sources, coverage varies by discipline and platform, so scores should be interpreted alongside citation data rather than in isolation, per NISO’s altmetrics recommended practice.

    What is the difference between altmetrics and bibliometrics?

    Bibliometrics measure scholarly interest through formal citation counts in indexed literature, while altmetrics track online engagement — downloads, mentions, shares, and discussion — across academic and public channels. The two measure different things and are designed to complement, not replace, each other.

    Is PlumX Metrics free?

    The PlumX artifact widget is free for any digital object with a DOI and can be embedded on repository or publication pages at no cost. Full institutional dashboards and analytics through Scopus/SciVal, however, require a paid Elsevier subscription.

    What is the difference between Altmetric and PlumX?

    Altmetric compresses attention into one weighted Attention Score with a donut visual, while PlumX keeps five categories separate in a Plum Print graphic with no combined number. The practical difference is aggregation: one number for quick ranking versus five categories for granular review.

    Implications for research assessment

    As institutions build multi-metric assessment dashboards, the choice is rarely PlumX or Altmetric. Many research-intensive universities use both, because institutions with Scopus access already see PlumX embedded in article records and many also subscribe to Altmetric Explorer for its stronger media and policy tracking. What matters for defensible assessment practice is documenting scope: which categories were pulled, on what date, and which known coverage gaps apply.

    Frameworks such as INORMS’s SCOPE model give research administration teams a structure for that documentation, tying metric choice back to the specific evaluative purpose rather than treating either tool’s output as a self-evident ranking. Consult the CASRAI Dictionary for definitions of related terms such as citation, altmetrics, and bibliometrics when drafting assessment policy language.

  • Field-Weighted Citation Impact: Where It Fails

    Field-weighted citation impact (FWCI) is a Scopus-derived metric that divides a publication’s actual citation count by the citation count expected for similar documents in the same subject field, publication year and document type. A result of 1.0 marks the global average; a result above 1.0 marks above-average citation impact. Before an institution builds review, promotion or tenure (RPT) criteria around it, the underlying normalisation assumptions need scrutiny.

    Field-weighted citation impact is defined by Elsevier as the ratio of citations actually received by an output to the citations that would be expected based on the average for the global scientific output of the same subject field, publication year and document type. It is calculated using Scopus data and surfaced through SciVal and Pure.

    What is field-weighted citation impact?

    Field-weighted citation impact is a normalised, article-level citation metric built into Elsevier’s SciVal and Scopus platforms. It expresses how a specific output, author, or institution has been cited relative to a global benchmark of comparable publications, rather than in raw citation counts that inevitably favour older papers and citation-heavy fields such as biomedicine.

    An FWCI of 1.48 means a document has been cited 48% more than expected for its field, year and type. An FWCI of 0.6 means it has been cited 40% less than expected. Because the benchmark of 1.0 is a field mean, and citation distributions are right-skewed, more than half of the outputs in a typical field will sit below that line, a distributional fact that is frequently lost in institutional reporting.

    How is FWCI calculated?

    The field-weighted citation impact formula is simple on its face: FWCI = actual citations received ÷ expected citations for similar documents. The “expected” figure is the average citation count for all Scopus-indexed documents sharing the same Scopus subject classification (ASJC code), publication year, and document type (article, review, conference paper, and so on).

    • A microbiology article published in 2023 that has received 20 citations, against a field average of 10 for similar 2023 microbiology articles, scores an FWCI of 2.0.
    • A humanities article with 3 citations against a field average of 2 scores an FWCI of 1.5 — a superficially similar score built on a far smaller, more volatile citation base.
    • SciVal aggregates FWCI across an author’s or institution’s full output set by averaging the individual outputs’ ratios, the definition given in Elsevier’s Research Metrics Guidebook, rather than by dividing total actual citations by total expected citations.

    This matters: a single highly cited outlier can lift a whole portfolio’s FWCI, which is why SciVal documentation recommends reading FWCI alongside output volume and citation distribution, not as a standalone score.
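
    The worked figures above can be reproduced in a few lines. The sketch below is illustrative only: the function names and the portfolio values are hypothetical, and it computes both a ratio of totals and a mean of per-output ratios so the two aggregation conventions can be compared. Which convention a given SciVal export applies should be confirmed against Elsevier’s documentation.

```python
def fwci(actual, expected):
    """FWCI of one output: actual citations divided by expected citations."""
    if expected <= 0:
        raise ValueError("expected citations must be positive")
    return actual / expected


def ratio_of_totals(outputs):
    """Sum actual and expected citations separately, then divide."""
    return sum(a for a, _ in outputs) / sum(e for _, e in outputs)


def mean_of_ratios(outputs):
    """Average the per-output FWCI values."""
    return sum(fwci(a, e) for a, e in outputs) / len(outputs)


print(fwci(20, 10))  # microbiology example: 2.0
print(fwci(3, 2))    # humanities example: 1.5

# Hypothetical portfolio of (actual, expected) pairs: four ordinary
# outputs, then the same set plus one highly cited outlier.
ordinary = [(5, 10), (8, 10), (3, 2), (1, 2)]
portfolio = ordinary + [(120, 10)]

for label, outputs in (("ordinary", ordinary), ("with outlier", portfolio)):
    print(label, round(ratio_of_totals(outputs), 2), round(mean_of_ratios(outputs), 2))
```

    Under either convention the single outlier moves the portfolio figure from below 1.0 to well above it, which is why the score needs to be read with the citation distribution beside it.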

    FWCI vs CiteScore and the Journal Impact Factor

    FWCI is often confused with journal-level metrics because all three numbers look similar: a decimal that usually falls somewhere between 0 and 10. They measure different things at different units of analysis, which is the first source of misapplication in policy documents.

    Metric | Unit of analysis | Field-normalised? | Source and window
    Field-weighted citation impact (FWCI) | Article, author, or institution | Yes: field, year, document type | Scopus data via SciVal; citations counted in the publication year plus the following three years
    CiteScore | Journal | No | Elsevier/Scopus; launched December 2016; citations received over four years to documents published in the same four years
    Journal Impact Factor (JIF) | Journal | No | Clarivate Journal Citation Reports; historically a 2-year citation window

    Neither CiteScore nor the JIF adjusts for subject field, so comparing a mathematics journal’s CiteScore to an oncology journal’s compares citation cultures, not quality. FWCI’s field normalisation does what DORA-aligned reformers have asked of journal metrics, and what those metrics mostly fail to do, which is also why FWCI is sometimes waved through review committees as the “responsible” metric without further scrutiny.

    Where FWCI breaks down: five assumptions to scrutinise

    FWCI’s field normalisation is a genuine improvement over raw citation counts and journal-level proxies, but it inherits several assumptions that DORA-aligned institutions should test before writing it into criteria.

    • Mean-based benchmarking, not percentile-based. FWCI compares an output to the field average, but citation distributions are heavily right-skewed: a small number of highly cited papers pull the mean upward, so most papers structurally score below 1.0 even when performing typically for their field. This is precisely why the Centre for Science and Technology Studies (CWTS) at Leiden University uses percentile-based indicators, such as the share of a unit’s output in the global top 10% most-cited, in its Leiden Ranking methodology rather than a mean-normalised ratio.
    • Subject classification is assigned to journals, not articles. Scopus’s ASJC subject codes are largely applied at the source-title level. An interdisciplinary article published in a broad-scope journal inherits that journal’s field classification, which can misrepresent the “expected” citation benchmark for a genuinely cross-disciplinary piece of work.
    • Small-sample volatility. For low-citation fields (much of the humanities, parts of engineering and mathematics) or for single articles, a difference of one or two citations can swing FWCI dramatically, because the expected-citation denominator is itself small. A score of 2.0 built on 20 citations is far more stable than one built on 2.
    • Self-citation is not excluded by default. Author, institutional, and journal self-citation inflate the numerator unless a self-citation exclusion is explicitly applied — a configurable option in SciVal, but one that is easy to omit when scores are pulled into a spreadsheet for a committee.
    • A single number cannot represent research quality, originality, or societal value. FWCI measures citation uptake within a fixed window; it says nothing about methodological rigour, reproducibility, data sharing, or the qualitative judgement DORA asks assessors to exercise in its place.
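
    The skew and small-sample points in the list above can be checked with a toy distribution. The sketch below is a minimal illustration; the ten citation counts are hypothetical and chosen only to be right-skewed, not drawn from Scopus.

```python
from statistics import mean, median

# Hypothetical citation counts for ten comparable papers in one field.
# One heavily cited paper pulls the mean well above the typical value.
field = [0, 0, 1, 1, 2, 2, 3, 4, 5, 82]

expected = mean(field)                       # the field average used as benchmark
scores = [c / expected for c in field]       # mean-normalised ratio per paper
below_one = sum(score < 1.0 for score in scores)

print(f"mean {expected:.1f}, median {median(field):.1f}")      # mean 10.0, median 2.0
print(f"{below_one} of {len(field)} papers score below 1.0")   # 9 of 10 papers score below 1.0


def ratio(citations, expected_citations):
    """Mean-normalised score for one paper."""
    return citations / expected_citations


# One extra citation on a small base versus a larger base, both starting at 2.0.
print(ratio(2, 1), ratio(3, 1))      # 2.0 3.0
print(ratio(20, 10), ratio(21, 10))  # 2.0 2.1
```

    The same starting score of 2.0 jumps by a full point on the small base and by a tenth on the larger one, which is the instability the small-sample bullet describes.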

    Should FWCI drive review, promotion and tenure decisions?

    The San Francisco Declaration on Research Assessment (DORA), issued in 2012, recommends that institutions not use journal-based metrics as a surrogate for the quality of individual articles, as a measure of individual researchers’ contributions, or as inputs to hiring, promotion, and funding decisions. FWCI’s article-level, field-normalised design addresses DORA’s specific objection to journal-level proxies such as the JIF, but it does not exempt FWCI from DORA’s broader principle that quantitative indicators should supplement, not replace, expert reading of the work itself.

    Institutions building RPT criteria around FWCI should require committees to read the underlying subject classification applied to a candidate’s outputs, check whether self-citations are excluded, and treat single-digit-citation scores as statistically unstable rather than definitive. A candidate’s FWCI trend across a full portfolio, read alongside narrative evidence of contribution, is a materially more defensible signal than a single score cited in isolation.
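
    Those three checks can be attached to each figure before it reaches a committee. The sketch below is one hypothetical way to do so: the record fields, the function name and the sample values are assumptions for illustration, and the single-digit threshold simply restates the guidance in the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class OutputScore:
    title: str
    citations: int
    fwci: float
    subject_field: str              # classification used for the benchmark
    self_citations_excluded: bool


def review_flags(score):
    """Return the caveats a committee should see next to one FWCI figure."""
    flags = [f"benchmark field: {score.subject_field}"]
    if score.citations < 10:
        flags.append("single-digit citation count: treat score as unstable")
    if not score.self_citations_excluded:
        flags.append("self-citations not excluded")
    return flags


# Hypothetical candidate output.
example = OutputScore(
    title="Hypothetical paper",
    citations=4,
    fwci=2.0,
    subject_field="Hypothetical subject area",
    self_citations_excluded=False,
)
for flag in review_flags(example):
    print(flag)
```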

    As UK Research and Innovation and equivalent funders continue to align assessment frameworks with responsible-metrics principles, institutions that document how they weight FWCI against qualitative peer judgement — rather than adopting it as a pass/fail threshold — will be better positioned to defend their research administration processes to auditors, funders, and appeals panels alike.

    Frequently asked questions

    What is the average FWCI?

    The global average FWCI is always 1.0 by mathematical construction, because the benchmark for “expected citations” is itself the average of comparable outputs. A score above 1.0 indicates above-average citation performance for that field, year, and document type; a score below 1.0 indicates below-average performance.

    How do I get my field-weighted citation impact?

    FWCI is retrieved through a SciVal subscription, where institutional users can search an author, publication set, or institution and view the FWCI directly on the metrics dashboard. Some institutions also surface FWCI through Pure, which synchronises the metric from Scopus on a scheduled basis where the integration is enabled.

    What is field-weighted citation impact ranking?

    FWCI is not itself a ranking system — it is a ratio, not a percentile or league-table position. Institutions sometimes rank authors, departments, or outputs by their FWCI scores internally, but this practice inherits all the mean-based and small-sample limitations described above and should be treated cautiously.

    Is field-weighted citation impact the same as CiteScore?

    No. FWCI operates at the article, author, or institution level and is field-normalised; CiteScore is a journal-level average citation rate with no field normalisation. A journal’s CiteScore says nothing about how any single article within it actually performed relative to its field.

    FWCI remains one of the more defensible citation metrics precisely because it was built to correct the field-blindness of journal-level indicators. Its value depends entirely on institutions applying it the way its own documentation recommends: alongside output volume, subject classification checks, and self-citation controls — not as a solitary number standing in for expert judgement in a promotion file.

  • CiteScore vs Impact Factor Under DORA and CoARA

    In the CiteScore vs Impact Factor comparison, neither metric wins under research-assessment reform: CiteScore (Elsevier/Scopus) tracks citations across a four-year window, while Journal Impact Factor (Clarivate/Web of Science) uses a two-year window with a denominator limited to “citable items”. DORA and CoARA both instruct assessors not to use either as a proxy for research quality.

    CiteScore is Elsevier’s Scopus-based journal metric, calculated by dividing the citations a title receives over a four-year window by the number of documents it published in those same four years. Journal Impact Factor (JIF) is Clarivate’s older, narrower equivalent, published annually through the Journal Citation Reports (JCR). Both numbers get quoted constantly in tenure files, funding applications and journal marketing, and both are formally out of step with how research-assessment reform now says journals should be judged.

    What Is the Difference Between CiteScore and Impact Factor?

    The core difference is database, window length, and document scope. CiteScore draws on Scopus and, under the methodology Elsevier adopted in 2020, counts citations to peer-reviewed document types (articles, reviews, conference papers, book chapters and data papers) over a four-year window, using the same set of document types in the numerator and the denominator. Journal Impact Factor draws on Web of Science and restricts its denominator to “citable items” (chiefly research articles and reviews) over a two-year window, even though its numerator counts citations to all document types.

    That asymmetry in JIF’s own formula — a broad numerator over a narrow denominator — is one of the most persistent, well-documented criticisms of the metric, and is a large part of why CiteScore, introduced by Elsevier in December 2016, was built with a wider document scope from the outset.

    Feature | CiteScore | Journal Impact Factor
    Provider | Elsevier | Clarivate
    Underlying database | Scopus | Web of Science (Journal Citation Reports)
    Citation window | 4 years | 2 years
    Document types counted | Articles, reviews, conference papers, book chapters, data papers (same set in numerator and denominator) | Denominator limited to “citable items” (articles, reviews); numerator counts citations to all document types
    Access | Free on Scopus journal pages | Requires a JCR subscription
    First introduced | December 2016 | Concept 1955; JCR published annually since 1975

    How Is Each Metric Calculated?

    CiteScore for year Y equals citations received in Y-3 through Y to documents published in Y-3 through Y, divided by the number of documents published across that same four-year span. Elsevier updates a “CiteScore Tracker” monthly, so the figure moves before the annual snapshot is finalised, a transparency feature JIF does not offer.

    Journal Impact Factor for year Y equals citations received in Y to items published in Y-1 and Y-2, divided by the number of “citable items” published in those same two years. Clarivate publishes the finalised figure once a year through the Journal Citation Reports, alongside a JIF quartile ranking within each subject category.
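
    The two calculations can be set side by side in code. This is a sketch under stated assumptions, not either provider’s implementation: the function names and journal figures are hypothetical, and the CiteScore function assumes the methodology Elsevier has used since 2020, in which citations and documents are both counted over the same four-year span.

```python
def citescore(citations, documents, year):
    """CiteScore for `year`, assuming a shared window of year-3 through year.

    citations[y]: citations received in year y to documents published in the window.
    documents[y]: documents published in year y.
    """
    window = range(year - 3, year + 1)
    return sum(citations[y] for y in window) / sum(documents[y] for y in window)


def impact_factor(citations_in_year, citable_items, year):
    """JIF for `year`.

    citations_in_year: citations received in `year` to items published
                       in year-1 and year-2.
    citable_items[y]:  citable items published in year y.
    """
    return citations_in_year / (citable_items[year - 1] + citable_items[year - 2])


# Hypothetical journal.
docs = {2020: 100, 2021: 100, 2022: 100, 2023: 100}
cites = {2020: 50, 2021: 200, 2022: 350, 2023: 400}
print(citescore(cites, docs, 2023))                      # 2.5

citable = {2021: 90, 2022: 90}
print(impact_factor(540, citable, 2023))                 # 3.0
```

    The same journal can therefore carry different figures on the two metrics purely because the windows and denominators differ.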

    • Shorter windows (JIF) react faster to hot topics but are more volatile for low-volume or slow-citing fields.
    • Longer windows (CiteScore) smooth out volatility but can undervalue journals in genuinely fast-moving disciplines.
    • Neither window length is “correct” — both were chosen as engineering trade-offs, not as validated proxies for quality.

    What Do DORA and CoARA Say About Journal-Level Metrics?

    The San Francisco Declaration on Research Assessment (DORA), published in 2012 and now signed by tens of thousands of individuals and organisations across more than 160 countries, states that journal-based metrics — explicitly including Impact Factor — should not be used “as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion, or funding decisions.” Although DORA’s original text names JIF, the same critique applies directly to CiteScore: both are journal-level averages applied to individual outputs and individual people.

    The Coalition for Advancing Research Assessment (CoARA), launched in 2022 and coordinated with the European University Association, commits its signatories, now numbering hundreds of universities, funders and research organisations, to “abandon inappropriate uses in research assessment of journal- and publication-based metrics, in particular any inappropriate uses of Journal Impact Factor.” The Agreement does not name CiteScore, but the metric falls within the same commitment, since CoARA’s ten commitments target the practice of substituting journal metrics for quality judgement, not one specific brand of metric.

    Neither declaration asks institutions to abolish CiteScore or JIF outright. Both ask assessors to stop using either figure as a shortcut for reading, or for judging, the individual piece of work in front of them.

    CiteScore vs Impact Factor: Which Survives Assessment Reform?

    Under DORA and CoARA criteria, neither metric “survives” as a legitimate proxy for individual-level quality, but CiteScore fares better on two specific reform tests: transparency and access. Its underlying Scopus data and monthly tracker are freely visible; JIF’s Web of Science data sits behind a JCR subscription, which is one reason CiteScore is often described as the more auditable of the two.

    Jurisdiction-specific policy already reflects this shift. The UK’s Research Excellence Framework (REF) guidance instructs assessment panels not to use journal-level metrics, including Impact Factor, as a proxy for output quality — panel members are required to read and judge the submitted work itself. Frameworks such as the Leiden Manifesto (2015) and the UK’s Metric Tide review (2015) reach the same conclusion from a different angle: any single citation metric, however calculated, is a partial and gameable signal that needs qualitative context, not a standalone score.

    In practice, most responsible-assessment guidance converges on the same answer: use CiteScore or JIF only as one directional data point about a journal’s citation behaviour — never as a stand-in for peer review, narrative CVs, or discipline-aware qualitative judgement of an individual’s work.

    Common Questions on CiteScore vs Impact Factor

    Which is better, Impact Factor or CiteScore?

    Neither is “better” in absolute terms. CiteScore suits fields with slower citation cycles and full Scopus coverage, while Journal Impact Factor suits comparisons within Web of Science’s narrower, more selective index. Under DORA and CoARA criteria, both are inappropriate substitutes for peer review or individual-level research assessment.

    What is a good CiteScore for a journal?

    A “good” CiteScore is field-relative. Elsevier’s own guidance points assessors toward a journal’s CiteScore Percentile rather than the raw number — a title at the 90th percentile outperforms 90% of journals in its Scopus subject category, which is more meaningful than comparing raw scores across disciplines.

    Is 3.5 a good Impact Factor?

    There is no universal threshold. A 3.5 Impact Factor is strong in fields with slow, sparse citation practices but modest in fast-citing fields such as immunology or oncology. Clarivate’s Journal Citation Reports ranks journals by subject-category quartile, not by a fixed numeric cutoff, for exactly this reason.

    What is a decent CiteScore?

    Elsevier measures this through the CiteScore Percentile: a title in the 96th percentile ranks as high as, or higher than, 96% of journals in its category. Institutions applying DORA principles are advised to cite percentile standing within a discipline rather than treat any single CiteScore value as “decent” in isolation.

    Implications for Institutions and Publishers

    For research administrators, the practical takeaway is procedural, not metric-specific: audit promotion, tenure and funding criteria for language that treats CiteScore or JIF as a quality proxy, and replace it with narrative or portfolio-based evaluation where DORA or CoARA commitments apply — a shift increasingly embedded in research administration standards and workflows. For publishers, transparency about which metric — and which window — is being quoted matters more than which number is higher, since CiteScore and JIF are not interchangeable and a journal can carry a strong figure on one while looking average on the other.

    As more funders and universities formalise CoARA commitments, expect journal-level metrics to persist as directional signals in publisher marketing and library collection decisions, while disappearing — by policy, not by accident — from individual hiring, promotion and grant-review criteria.