What is DNA sequencing?

DNA sequencing is the process of determining the precise order of the four bases (adenine, thymine, guanine, and cytosine) along a strand of DNA. In effect, it reads the genetic information encoded in a sample.

Reading the order of bases

The information in DNA is encoded in the order of its four bases, so determining that order — sequencing — is fundamental to understanding genes and genomes. DNA sequencing produces a readout of the base sequence of a sample, which researchers can then analyse, compare, and annotate. From a short gene to an entire genome, sequencing turns a physical DNA molecule into digital sequence data that can be stored and shared.
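As a rough illustration of what "digital sequence data" means in practice, a read can be held as a plain string over the letters A, T, G, and C and analysed with ordinary string operations. The sketch below is a minimal example under that assumption. The function names and the sample read are invented for illustration and do not come from any particular sequencing toolkit.

```python
from collections import Counter

# Watson-Crick pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def base_counts(seq: str) -> Counter:
    """Count how many times each base letter appears in the sequence."""
    return Counter(seq.upper())


def gc_content(seq: str) -> float:
    """Return the fraction of bases that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read in its own direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


read = "ATGCGTAC"  # made-up example read, not real data
print(base_counts(read))
print(gc_content(read))          # 0.5
print(reverse_complement(read))  # GTACGCAT
```

Real readouts usually also carry per-base quality information and ambiguity codes such as N, which this sketch ignores.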

Sanger sequencing

The first widely used method was Sanger sequencing, a chain-termination technique published by Frederick Sanger and colleagues in 1977. Sanger received a share of the 1980 Nobel Prize in Chemistry for this work.

Sanger sequencing reads one DNA fragment at a time with high accuracy and was the workhorse behind early genome projects, including much of the Human Genome Project. It remains valued today for verifying short sequences with precision.

Next-generation sequencing

From the mid-2000s, next-generation sequencing (NGS) transformed the field by reading millions of DNA fragments in parallel. Platforms such as those developed by Illumina dramatically cut the cost and time of sequencing, making whole-genome and large-scale studies routine. Because each run yields a very large number of reads, bioinformatics is essential for assembling and interpreting NGS output.
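To give a feel for what assembling reads involves, here is a toy sketch that repeatedly merges the two reads with the longest suffix-to-prefix overlap. It is a deliberately simplified greedy approach, not the method used by any production assembler. It assumes error-free reads from a single strand, and the names `overlap`, `greedy_assemble`, and `min_len` are hypothetical.

```python
from __future__ import annotations


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge the best-overlapping pair of reads until no overlaps remain."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i == j:
                    continue
                k = overlap(a, b, min_len)
                if k > best_len:
                    best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return reads  # nothing left to merge
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)


# Three made-up reads that tile a 12-base sequence.
contigs = greedy_assemble(["ATGCGT", "CGTACC", "ACCTGA"])
print(contigs)  # ['ATGCGTACCTGA']
```

The pairwise comparison is quadratic in the number of reads, so it only suits tiny examples. Real pipelines must also handle sequencing errors, repeats, and reads from both strands.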

Applications and data standards

DNA sequencing underpins genomics, evolutionary biology, microbiology, and research into genetic variation. Because sequencing generates very large datasets that are widely reused, the community curates them in repositories such as the European Nucleotide Archive and GenBank under shared formats and metadata standards, supporting findable, accessible, interoperable, and reusable (FAIR) data.
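As one example of a shared format, sequences are commonly exchanged as FASTA text, where a header line starting with `>` is followed by one or more lines of bases. FASTA is not named in the text above and is used here as an assumed illustration. The sketch below is a minimal reader, with `parse_fasta` and the sample records invented for the purpose.

```python
from __future__ import annotations

from typing import Iterable, Iterator


def parse_fasta(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines.

    The header is returned without its leading '>'. Sequence lines that
    belong to the same record are joined into a single string.
    """
    header = None
    chunks: list[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up records for illustration only.
example = """>seq1 hypothetical example record
ATGCGTAC
GGTTAA
>seq2 another made-up record
TTGACA
"""

records = list(parse_fasta(example.splitlines()))
for name, seq in records:
    print(name, len(seq))
```

Because the function accepts any iterable of lines, it can be fed an open file handle as well as an in-memory string. Repository records also carry richer metadata than a FASTA header holds, which this sketch does not attempt to model.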

Key facts

Definition: determining the order of DNA bases
Bases read: A, T, G, C
Sanger sequencing: developed in the 1970s (Nobel Prize 1980)
Next-generation sequencing: massively parallel, from mid-2000s
Key NGS platform: Illumina
Analysis: depends on bioinformatics

Common questions

What is DNA sequencing?

DNA sequencing is the process of determining the exact order of the bases — adenine, thymine, guanine, and cytosine — in a DNA molecule. It converts a physical DNA sample into readable sequence data for analysis.

What is the difference between Sanger and next-generation sequencing?

Sanger sequencing reads one DNA fragment at a time with high accuracy and was used for early genome projects. Next-generation sequencing reads millions of fragments in parallel, making large-scale sequencing far faster and cheaper.
