When researchers think about persistent identifiers, they usually picture DOIs on papers and datasets or ORCID iDs on people. Yet a great deal of research turns on physical things: a sediment core drilled from a lake bed, a tissue specimen in a biobank, a water sample from a particular depth on a particular day, or the spectrometer that analysed it. These physical research objects have historically been referred to by inconsistent local labels, if they were referred to at all. Two complementary efforts, the IGSN for samples and the PIDINST work for instruments, set out to give them stable, global identifiers.
Why physical objects need PIDs
The case for identifying physical objects mirrors the case for identifying any research output. A persistent identifier lets a sample or instrument be referred to unambiguously across publications, datasets, and laboratories. It allows the measurements derived from a sample to be linked back to the sample itself, and onward to the instrument that produced them. Without such links, reuse and verification become difficult: a reader cannot easily tell whether two studies analysed the same specimen, or whether a calibration problem on a particular instrument might affect a body of results. Persistent identification turns scattered physical objects into nodes in a connected research graph, supporting the goals of FAIR data.
IGSN: identifiers for samples
The IGSN began in the geosciences as the International Geo Sample Number, a way to give individual physical samples a globally unique identifier so that specimens could be tracked and cited across the literature. As the approach proved useful beyond geology, the system evolved. Samples are now identified by IGSN IDs, which are registered through DataCite as DOIs, bringing sample identification into the same infrastructure used for datasets and other outputs. This alignment means a sample can carry a resolvable identifier, a landing page, and structured metadata describing what the sample is, where and when it was collected, and how it relates to other objects.
The practical effect is that a physical specimen becomes a citable entity. A paper can reference the exact sample it analysed; a dataset can link each measurement to the sample it came from; and a repository can expose the provenance of its holdings. For disciplines that depend on irreplaceable physical material, from earth science to the life sciences, this is a meaningful advance in traceability.
PIDINST: identifiers for instruments
Where IGSN addresses samples, the PIDINST working group, convened under the Research Data Alliance, addressed the instruments themselves. The group developed a metadata schema for persistent identification of measuring instruments, so that a microscope, sensor, telescope, or analytical device can be referenced by a persistent identifier and described in a consistent way. The schema captures the kind of information that makes an instrument identifiable and useful to cite: what it is, who owns or operates it, its model and configuration, and identifiers for related entities such as the institution that hosts it.
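To make the idea of a consistent instrument description concrete, the following is a minimal sketch in Python. It is not the PIDINST schema itself: the class names, field names, and the relation label "hostedBy" are assumptions chosen to mirror the kinds of information described above, and every identifier value is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RelatedIdentifier:
    """A link from the instrument to another identified entity."""
    value: str
    identifier_type: str  # e.g. "ROR" for a hosting institution
    relation: str         # hypothetical label, not a controlled term


@dataclass
class InstrumentRecord:
    """Illustrative instrument description; field names are assumptions."""
    identifier: str
    name: str
    owner: str
    manufacturer: str
    model: str
    description: str = ""
    related: List[RelatedIdentifier] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of descriptive fields that are empty."""
        required = ("identifier", "name", "owner", "manufacturer", "model")
        return [f for f in required if not getattr(self, f).strip()]


# Placeholder values only; none of these identifiers is real.
spectrometer = InstrumentRecord(
    identifier="10.1234/example-instrument",
    name="Example mass spectrometer",
    owner="Example University",
    manufacturer="Example Instruments Ltd",
    model="",
    related=[RelatedIdentifier("https://ror.org/example", "ROR", "hostedBy")],
)

print(spectrometer.missing_fields())  # the model was left blank above
```

A completeness check of this kind is one simple way a facility might catch under-described instruments before registering them; a real deployment would validate against the published schema rather than a hand-written list of fields.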
Identifying instruments matters because the measuring apparatus is part of the methods. When the data from an experiment can be linked to the specific instrument that produced them, it becomes possible to assess instrument-related effects, to credit the facilities that maintain expensive equipment, and to trace a result from a published figure all the way back to the device on a laboratory bench.
Connecting the chain of provenance
The real power of these identifiers appears when they are used together. Imagine a measurement linked to the instrument that produced it via a persistent identifier registered with PIDINST metadata, to the sample it was taken from via an IGSN ID, to the dataset it belongs to via a DataCite DOI, and to the researchers responsible via their ORCID iDs. Each link is a small piece of metadata, but together they describe an unbroken chain of provenance from a published claim back to the physical objects and people behind it. That is precisely the kind of connected, machine-actionable record that modern research infrastructure aspires to.
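The chain described here can be sketched as a small set of subject, relation, object links. The Python below is an illustration under stated assumptions: the Measurement class, the provenance_links helper, and the relation names ("measuredWith", "derivedFrom", "partOf", "attributedTo") are hypothetical and do not come from any particular standard. The DOIs are placeholders, and the ORCID iD is the fictitious example researcher commonly used in ORCID documentation.

```python
from dataclasses import dataclass
from typing import List, Tuple

Link = Tuple[str, str, str]  # (subject, relation, object)


@dataclass(frozen=True)
class Measurement:
    """One measurement and the identifiers that anchor its provenance."""
    label: str
    instrument_pid: str   # PID of the instrument that produced it
    sample_igsn_id: str   # IGSN ID of the sample it was taken from
    dataset_doi: str      # DOI of the dataset it belongs to
    orcid_ids: Tuple[str, ...]  # researchers responsible


def provenance_links(m: Measurement) -> List[Link]:
    """Expand a measurement into explicit provenance links."""
    links: List[Link] = [
        (m.label, "measuredWith", m.instrument_pid),
        (m.label, "derivedFrom", m.sample_igsn_id),
        (m.label, "partOf", m.dataset_doi),
    ]
    # One attribution link per responsible researcher.
    links.extend((m.label, "attributedTo", orcid) for orcid in m.orcid_ids)
    return links


# Placeholder identifiers for illustration only.
reading = Measurement(
    label="measurement-001",
    instrument_pid="https://doi.org/10.1234/example-instrument",
    sample_igsn_id="https://doi.org/10.1234/example-sample",
    dataset_doi="https://doi.org/10.1234/example-dataset",
    orcid_ids=("https://orcid.org/0000-0002-1825-0097",),
)

for subject, relation, obj in provenance_links(reading):
    print(f"{subject} --{relation}--> {obj}")
```

Storing the links as plain triples keeps the sketch independent of any one metadata format; the same information could equally be carried as related-identifier entries in each object's metadata record.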
Towards a fully identified research record
Extending persistent identification to samples and instruments fills two of the larger gaps in the research record. Articles, data, organisations, and people increasingly carry stable identifiers; physical objects and the apparatus that measures them have lagged behind. By bringing samples into the DataCite ecosystem as IGSN IDs and by giving instruments a shared metadata schema through PIDINST, the community is steadily closing those gaps. The vocabularies and crosswalks that hold such a record together are the kind of standards work catalogued in the CASRAI data dictionary, and they complement contributor frameworks such as CRediT by anchoring the human contributions to the physical things they acted upon.