Tag: bibliometrics

  • The h-index and Author-Level Metrics Explained

    The h-index is an author-level metric, proposed by physicist Jorge Hirsch in 2005, that attempts to capture both a researcher’s productivity and the citation impact of their work in a single number. A researcher has an h-index of h if they have published h papers that have each been cited at least h times. It is widely reported by citation databases, but its convenience hides important limitations that make it unsuitable as a standalone measure of a researcher’s worth.

    The appeal of the h-index is that it rewards a sustained body of well-cited work rather than a single highly cited paper or a long list of uncited ones. The risk is that a single integer flattens a complex career into something easily, and often wrongly, compared.

    A worked example

    To find an h-index, rank an author’s papers by citation count from highest to lowest, then find the last rank at which the rank number is still less than or equal to the citation count. That rank is the h-index.

    Rank | Citations | Rank ≤ citations?
    1    | 40        | Yes
    2    | 18        | Yes
    3    | 12        | Yes
    4    | 9         | Yes
    5    | 5         | Yes
    6    | 3         | No

    Here the author has five papers each cited at least five times, but the sixth paper has only three citations. The h-index is therefore 5. Adding more lightly cited papers would not raise the score, and neither would a single paper with hundreds of citations: the h-index can never exceed the number of papers that meet the threshold.
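
    The ranking procedure above can be sketched in a few lines of Python. This is a minimal illustration rather than the method any particular database uses; the function name h_index and the plain list of citation counts are assumptions made for the example.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # this rank still satisfies "rank <= citations"
        else:
            break      # every later rank will fail as well
    return h


# Citation counts from the worked example above.
example = [40, 18, 12, 9, 5, 3]
print(h_index(example))  # 5
```

    Appending further papers with one or two citations to the list, or replacing 40 with 400, leaves the result at 5.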

    What the h-index captures and what it misses

    The h-index is robust to two extremes: it is not inflated by one runaway hit, nor by a large volume of uncited output. It rewards consistency. However, it ignores information by design. Citations above the threshold do not count, so a paper cited 40 times and a paper cited 5 times can both sit inside an h-index of 5 with no distinction between them. Because citation counts only accumulate, it also cannot decrease, so it tends to rise with time regardless of recent activity.
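
    To make the lost information concrete, the sketch below compares two hypothetical citation profiles that share an h-index of 5 but differ widely in total citations. The profiles and variable names are invented for illustration only.

```python
def h_index(citations):
    """Count the ranks (in descending citation order) where citations >= rank."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)


# Two hypothetical authors, five papers each.
profile_a = [40, 18, 12, 9, 5]   # includes several well-cited papers
profile_b = [5, 5, 5, 5, 5]      # every paper sits exactly at the threshold

h_a, h_b = h_index(profile_a), h_index(profile_b)
total_a, total_b = sum(profile_a), sum(profile_b)

print(h_a, total_a)  # 5 84
print(h_b, total_b)  # 5 25
```

    Both profiles produce the same h-index, which is why total citations are usually read alongside it.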

    Field and career-stage limitations

    Citation behaviour varies enormously between disciplines. Fields with large communities and rapid publication accumulate citations faster than smaller or slower-moving fields, so raw h-indices cannot be compared across subjects fairly. The metric is also strongly correlated with career length, because it can only grow over time. This systematically disadvantages early-career researchers and anyone with a career break, and it says nothing about an individual’s specific role in collaborative work. These distortions are precisely the kind that the critique of journal metrics warns about, applied at the level of the person rather than the journal.

    Gaming and integrity concerns

    Because it is derived from citation counts, the h-index can be manipulated, for instance through excessive self-citation or coordinated citation arrangements. Database coverage also affects the result: the same researcher can have different h-indices in different databases depending on what each indexes. These vulnerabilities reinforce why the metric should never be the sole basis for an evaluation, a position consistent with our responsible assessment coverage.

    Complementary metrics

    Several measures are used alongside the h-index to compensate for its blind spots.

    • i10-index. The number of an author’s publications with at least ten citations, a simple complement that is easy to interpret.
    • m-quotient. The h-index divided by the number of years since a researcher’s first publication, intended to reduce the bias towards longer careers and allow fairer comparison across career stages.
    • Total citations and field-normalised indicators. These add information that the h-index discards, including high-impact outliers and disciplinary context.
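
    The first two measures in this list are simple enough to sketch directly. The following is an illustration under stated assumptions: the function names are hypothetical, the years are invented, and the way "years since first publication" is counted (here, a plain difference between calendar years) is one convention among several.

```python
def i10_index(citations):
    """Number of publications with at least ten citations."""
    return sum(1 for count in citations if count >= 10)


def m_quotient(h, first_publication_year, current_year):
    """h-index divided by the number of years since the first publication."""
    years = current_year - first_publication_year
    if years <= 0:
        # Assumption: the quotient is undefined in the first publication year.
        raise ValueError("career length must be at least one year")
    return h / years


citations = [40, 18, 12, 9, 5, 3]    # counts from the worked example
print(i10_index(citations))          # 3

# Hypothetical career: h-index of 5, first paper in 2015, evaluated in 2025.
print(m_quotient(5, 2015, 2025))     # 0.5
```

    Under this definition, two researchers with the same h-index but careers of ten and twenty years would have m-quotients of 0.5 and 0.25 respectively.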

    No single number resolves all the problems, which is why frameworks such as DORA and the Leiden Manifesto, discussed in our piece on DORA and responsible research assessment, insist that quantitative indicators support rather than replace expert qualitative judgement.

    Using the h-index responsibly

    Used responsibly, the h-index is a descriptive summary, not a verdict. It should be interpreted within a discipline, adjusted for career stage, read alongside complementary metrics, and always subordinated to reading the actual work. Definitions of these author-level measures are maintained in our standards dictionary for consistent use across evaluation processes.

    Frequently asked questions

    Who created the h-index and when?

    The h-index was proposed by the physicist Jorge Hirsch in 2005 as a way to characterise both the productivity and citation impact of a researcher’s output in a single figure.

    How is the h-index calculated?

    Rank an author’s papers by citation count and find the largest number h such that h papers each have at least h citations. If five papers each have at least five citations but the sixth has fewer, the h-index is five.

    Why can’t h-indices be compared across fields?

    Citation rates differ markedly between disciplines, so researchers in fast-citing fields accumulate higher h-indices than those in smaller or slower fields, making raw cross-field comparison misleading.

    What is the m-quotient?

    The m-quotient is the h-index divided by the number of years since a researcher’s first publication. It is designed to reduce the bias towards longer careers and enable fairer comparison across career stages.

  • Responsible metrics: the Leiden Manifesto and the Metric Tide in practice

    Metrics are seductive because they are simple. A single number — a journal’s impact factor, a researcher’s h-index, a citation count — promises to compress the messy, qualitative business of judging research into something fast, comparable and apparently objective. And metrics are dangerous for exactly the same reason: their simplicity hides what they leave out, and their apparent objectivity lends unearned authority to comparisons they cannot really support. The response to this tension has not been to abolish metrics but to use them responsibly — to let quantitative indicators inform expert judgement rather than replace it. Two landmark statements from 2015, the Leiden Manifesto and The Metric Tide, set out what responsible use looks like. This article examines both and how they translate into practice, drawing on the responsible assessment domain of the CASRAI Dictionary.

    The Leiden Manifesto

    The Leiden Manifesto for research metrics, published in 2015, offers ten principles for the responsible use of quantitative indicators. Several of its themes recur throughout the responsible-metrics movement and are worth drawing out. It insists that quantitative evaluation should support, not supplant, qualitative expert assessment — metrics inform judgement; they do not make it. It warns against measuring performance against inappropriate or generic benchmarks, urging that assessment account for the mission and context of the research. It calls for transparency in the data and methods behind any indicator, so that those being assessed can understand and scrutinise how they are judged. It highlights the importance of accounting for variation between fields, since citation behaviour differs enormously across disciplines and naive comparison across them is meaningless. And it cautions against the distortions metrics produce when they become targets — the well-known problem that an indicator, once it is what people are rewarded for, stops measuring what it was meant to.

    The Metric Tide

    Published the same year, The Metric Tide was an independent review of the role of metrics in research assessment, conducted in the United Kingdom. Its central contribution was the concept of responsible metrics, defined through a set of dimensions that have become a common reference point:

    • Robustness — basing indicators on the best available, accurate data.
    • Humility — recognising that quantitative evaluation should support, not supplant, expert assessment.
    • Transparency — keeping data collection and analytical processes open to scrutiny.
    • Diversity — accounting for variation by field and using a range of indicators to reflect the plurality of research.
    • Reflexivity — recognising and anticipating the systemic effects of indicators and updating them in response.

    The review was notably sceptical of reducing assessment to single numbers and emphasised that metrics work best as a complement to peer review, not a substitute for it. Its framing of responsible metrics as a set of dimensions to be designed for, rather than a checklist to be passed, has proved durable.

    What the two have in common

    Read together, the Leiden Manifesto and The Metric Tide converge on a consistent message. Metrics are useful but partial; they must be transparent so they can be questioned; they must respect disciplinary difference; they must be used with humility alongside expert judgement; and their users must stay alert to the behaviour they induce, because any metric that becomes a target will eventually be gamed or will distort the work it was meant to measure. Neither document is anti-metric. Both are against the misuse of metrics — against the false precision of a single number standing in for a considered judgement about the quality and significance of research.

    From principle to practice

    Translating these principles into institutional practice means concrete commitments: assessing research on its own merits rather than on the prestige of its publication venue, using a basket of indicators rather than any single one, being transparent about what is measured and how, contextualising comparisons by field and career stage, and keeping expert peer judgement at the centre with metrics in a supporting role. These commitments connect directly to the broader assessment-reform movement. The principle of not judging research by where it is published is the heart of the comparison in our DORA versus CoARA overview, while the specific hazards of the two most over-used single numbers are examined in our look at the journal impact factor versus the h-index. Responsible metrics is the methodological backbone these reform initiatives share.

    Metrics and the recognition of contribution

    One reason single-number metrics mislead is that they obscure who actually did the work and what they did. A citation count attaches to a paper, not to the distinct contributions of the people who made it. Structured contributorship through the CRediT taxonomy — whose full set of roles is described in our overview of the CRediT roles — offers a more granular and honest picture of contribution than any aggregate metric can, and is a natural complement to responsible assessment: it supports judging people on what they genuinely contributed rather than on a number that flattens it. The consistent vocabulary that lets assessment frameworks, indicators and contribution records be described and exchanged the same way across systems is maintained in the CASRAI Dictionary, helping ensure that responsible metrics rests on a shared and well-defined foundation.

  • The Journal Impact Factor: Meaning and Critique

    The impact factor, properly the Journal Impact Factor (JIF), is a bibliometric measure of how often, on average, articles in a journal are cited within a defined recent period. It is published annually by Clarivate in the Journal Citation Reports (JCR) and is one of the most influential, and most contested, numbers in scholarly publishing. Understanding what it actually measures is the first step to using it responsibly.

    In short: the impact factor is a property of a journal, calculated from citation averages, and it was never designed to evaluate individual articles or researchers.

    How the impact factor is calculated

    The standard two-year Journal Impact Factor for a given year is a ratio. The numerator is the number of citations received in that year to items the journal published in the two preceding years. The denominator is the number of citable items, typically research articles and reviews, published in those same two years.

    Component   | What it counts
    Numerator   | Citations in the current year to articles from the previous two years
    Denominator | Number of citable articles published in those two years
    Result      | Average citations per citable article over the two-year window

    A journal with a JIF of 5 received, in the year being reported, an average of five citations per citable item published in the preceding two years. Clarivate also publishes a five-year variant that widens the citation window for fields where impact accrues more slowly.
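
    The ratio described above can be written as a short function. This is a sketch of the arithmetic only: the function and parameter names are hypothetical, the figures are invented, and it does not reproduce Clarivate’s rules for deciding which items count as citable.

```python
def journal_impact_factor(citations_to_prior_two_years, citable_items_prior_two_years):
    """Two-year impact factor: citations received this year to items from the
    previous two years, divided by citable items published in those two years."""
    if citable_items_prior_two_years <= 0:
        raise ValueError("the journal needs at least one citable item")
    return citations_to_prior_two_years / citable_items_prior_two_years


# Hypothetical journal: 200 citable items over two years,
# cited 1,000 times in the current year.
print(journal_impact_factor(1000, 200))  # 5.0
```

    Because the result is a ratio, reclassifying items as non-citable shrinks the denominator and raises the figure without any change in citations.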

    What the impact factor does and does not measure

    The JIF captures the recent average citation rate of a journal’s body of work. It can offer a rough sense of how actively a journal’s content is being cited within a couple of years of publication. What it does not measure is the quality, rigour or importance of any single article, because citations within a journal are highly skewed: a small number of heavily cited papers can pull the average up while most articles receive far fewer citations. The mean is therefore a poor predictor of any individual paper’s citations.
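
    A small numerical sketch shows how skew separates the journal average from the typical article. The citation counts below are invented for illustration and do not describe any real journal.

```python
import statistics

# Hypothetical citation counts for ten articles in one journal.
citations_per_article = [120, 45, 8, 5, 4, 3, 2, 2, 1, 0]

mean_citations = sum(citations_per_article) / len(citations_per_article)
median_citations = statistics.median(citations_per_article)

# How many articles fall below the journal-level average?
below_mean = sum(1 for count in citations_per_article if count < mean_citations)

print(mean_citations)    # 19.0
print(median_citations)  # 3.5
print(below_mean)        # 8
```

    In this invented case two heavily cited papers lift the average to 19, while eight of the ten articles sit below it and the median article has 3.5 citations.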

    Other well-known limitations include differences in citation culture between disciplines, which make cross-field comparison misleading, and decisions about which items count as citable, which change the denominator and therefore the result.

    Garfield’s own caveats

    Eugene Garfield, who originated the impact factor as a tool for selecting journals for indexing, repeatedly cautioned against misusing it. He noted that the metric was intended to help librarians and editors compare journals, not to judge the worth of individual scientists, and warned that the skewed distribution of citations meant the journal average should not be read across to the papers it contains. The critique of the metric is therefore not an external attack so much as a return to its creator’s own warnings.

    Why DORA and the Leiden Manifesto warn against misuse

    Two landmark statements formalise these concerns. The San Francisco Declaration on Research Assessment (DORA) recommends not using journal-based metrics such as the JIF as a surrogate measure of the quality of individual research articles, or to assess an individual scientist’s contributions for hiring, promotion or funding decisions. The Leiden Manifesto for research metrics sets out ten principles for the responsible use of quantitative indicators, including that quantitative evaluation should support, not replace, expert qualitative assessment, and that performance should be measured against the research mission rather than against generic benchmarks.

    These principles are central to the wider shift covered in our responsible assessment coverage, and they sit alongside author-level measures discussed in our standards dictionary. The common thread is simple: a journal-level average should never stand in for reading the work itself.

    Using the impact factor sensibly

    Responsible use means treating the JIF as one descriptive feature of a journal rather than a verdict on the papers within it or the people who wrote them. Where citation context matters, field-normalised indicators and a basket of complementary metrics are more defensible than a single number. Crucially, evaluation of individuals should rest on the content and contribution of their outputs, an approach reinforced across our guidance for authors.

    Frequently asked questions

    What citation window does the standard impact factor use?

    The standard Journal Impact Factor uses a two-year window: it counts citations in the current year to articles the journal published in the previous two years, divided by the number of citable items in those two years.

    Who calculates and publishes the impact factor?

    The Journal Impact Factor is calculated and published annually by Clarivate in the Journal Citation Reports, based on its citation database.

    Why is it wrong to judge a researcher by journal impact factors?

    Because citations within any journal are highly skewed, the journal average does not predict an individual article’s citations. DORA explicitly recommends against using journal-based metrics as a proxy for the quality of individual articles or researchers.

    What should be used instead?

    Responsible assessment favours reading the work, supported where appropriate by article-level and field-normalised indicators and expert qualitative judgement, as set out in DORA and the Leiden Manifesto.

  • Web of Science: What It Indexes and How It Works

    Web of Science is a curated, selective citation-indexing platform operated by Clarivate that records scholarly publications and the citation links between them, enabling researchers to trace how ideas connect across the literature. Rather than indexing everything it can find, Web of Science applies editorial selection criteria, and its data underpins widely used research metrics including those published in the Journal Citation Reports.

    This article explains what the Web of Science Core Collection contains, how citation indexing works, its relationship to the Journal Impact Factor, and why a citation index is fundamentally different from a general search engine.

    The Core Collection and selective indexing

    At the heart of Web of Science is the Core Collection, a set of citation indexes covering the sciences, social sciences and arts and humanities, together with conference proceedings and book content. The defining characteristic is selectivity: journals are evaluated against editorial and quality criteria before being accepted, and coverage is curated rather than exhaustive. The intention is that the corpus represents influential, well-edited scholarly literature, so that the citation relationships drawn from it are meaningful.

    This selectivity is the central trade-off of the platform. A narrower, vetted corpus yields cleaner citation data, but it also means many legitimate outputs — particularly in regions, languages or fields with less established journals — may fall outside coverage. Understanding what is and is not indexed is essential before using the data for any kind of assessment.

    The citation index: Garfield’s idea

    The conceptual foundation of Web of Science is the citation index, an idea developed by Eugene Garfield, who founded the Institute for Scientific Information. The insight was simple but powerful: by systematically recording which papers cite which other papers, you create a navigable network of the literature. From any article you can move backwards to the references it cites and forwards to the later papers that cite it.

    This forward-and-backward navigation is what distinguishes a citation index from a bibliographic list. It lets researchers follow the development of an idea over time, identify foundational works, and gauge the influence of a paper by the citations it accrues. The same citation graph is the raw material from which bibliometric indicators are computed.
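
    The forward-and-backward structure can be sketched as a pair of lookup tables. This is a toy model with hypothetical class, method and paper names; it illustrates the idea of a citation index and says nothing about how Web of Science stores its data.

```python
from collections import defaultdict


class CitationIndex:
    """Toy citation index that records who cites whom in both directions."""

    def __init__(self):
        self._references = defaultdict(set)  # paper -> papers it cites
        self._cited_by = defaultdict(set)    # paper -> papers that cite it

    def add_citation(self, citing, cited):
        self._references[citing].add(cited)
        self._cited_by[cited].add(citing)

    def references(self, paper):
        """Backward navigation: the works this paper cites."""
        return sorted(self._references.get(paper, ()))

    def cited_by(self, paper):
        """Forward navigation: the later works that cite this paper."""
        return sorted(self._cited_by.get(paper, ()))

    def times_cited(self, paper):
        """Citation count, the raw input to indicators such as the h-index."""
        return len(self._cited_by.get(paper, ()))


index = CitationIndex()
index.add_citation("paper_B", "paper_A")
index.add_citation("paper_C", "paper_A")
index.add_citation("paper_C", "paper_B")

print(index.references("paper_C"))   # ['paper_A', 'paper_B']
print(index.cited_by("paper_A"))     # ['paper_B', 'paper_C']
print(index.times_cited("paper_A"))  # 2
```

    Storing each link in both directions is what makes forward navigation as cheap as backward navigation; a plain reference list supports only the latter.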

    The Journal Citation Reports and the Impact Factor

    Web of Science citation data feeds the Journal Citation Reports (JCR), Clarivate’s annual analysis of journal-level citation performance. The JCR is the source of the well-known Journal Impact Factor, a journal-level metric calculated from citation counts to a journal’s recent articles. Because the Impact Factor is derived from Web of Science data, a journal must be indexed in the relevant part of the Core Collection to receive one.

    Element                  | What it is
    Core Collection          | The curated set of citation indexes underpinning the platform
    Citation index           | The network of citing–cited relationships between publications
    Journal Citation Reports | Annual journal-level citation analysis built on the data
    Journal Impact Factor    | A journal-level metric published within the JCR

    The Impact Factor is a journal-level measure, and responsible-metrics initiatives widely caution against using it as a proxy for the quality of any individual article or researcher. They encourage using it carefully and in context.

    How it differs from a search engine

    A general web search engine indexes pages it can crawl and ranks them by relevance and popularity signals. Web of Science is different in three respects: its corpus is selected rather than crawled; its core data structure is the citation graph rather than full-text relevance; and its records are structured bibliographic metadata — authors, affiliations, references, funding — rather than raw web content. This makes it a tool for analysis and discovery within the scholarly record, not a general-purpose finder of web pages. Related tools and systems are covered across our research information systems section.

    Web of Science is frequently compared with Elsevier’s Scopus, the other large multidisciplinary citation database; we set the two side by side in our Scopus versus Web of Science comparison. Both rely on persistent identifiers such as the DOI to link records reliably, and definitions of the metrics involved appear in the CASRAI Dictionary.

    Frequently asked questions

    Is Web of Science free to use?

    No. Web of Science is a subscription product from Clarivate, typically licensed by universities, research institutions and libraries. Access depends on your organisation’s subscription.

    Does being in Web of Science mean a journal is high quality?

    Inclusion signals that a journal met the platform’s selection criteria, which is a meaningful editorial threshold. It is not, however, an absolute or universal measure of quality, and many reputable journals sit outside its coverage.

    What is the difference between Web of Science and the Journal Citation Reports?

    Web of Science is the underlying citation database; the Journal Citation Reports is an annual analytical product built from that data, and it is where the Journal Impact Factor is published.

    Who invented the citation index?

    The citation-index concept was developed by Eugene Garfield, founder of the Institute for Scientific Information, whose work established the systematic recording of citation links that Web of Science still embodies.