What is a recurrent neural network?

A recurrent neural network is a type of neural network with connections that loop back, giving it a form of memory that lets it process sequential data such as text, speech, or time series one element at a time.

Networks with memory

A recurrent neural network processes a sequence one element at a time, maintaining a hidden state that carries information from earlier steps forward. Because connections loop back, the network's output at each step depends not only on the current input but also on what it has seen before, which acts as a form of memory. This makes RNNs naturally suited to sequential data where order matters: language, audio, and time series. Feedforward networks, by contrast, keep no such state and treat each input independently.
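The hidden-state update described above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming the common formulation h_t = tanh(W_xh x_t + W_hh h_{t-1} + b); the function names, the weight values, and the choice of tanh are assumptions made for the example, not details given in this article.

```python
import math

def rnn_step(x, h_prev, W_xh, W_hh, b):
    """One recurrent step: h_t = tanh(W_xh x_t + W_hh h_prev + b)."""
    h_new = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(W_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new

def rnn_forward(sequence, W_xh, W_hh, b):
    """Apply rnn_step across a sequence, returning the hidden state at each step."""
    h = [0.0] * len(b)  # the "memory" starts empty
    states = []
    for x in sequence:
        h = rnn_step(x, h, W_xh, W_hh, b)
        states.append(h)
    return states

# Hypothetical fixed weights: 1 input feature, 2 hidden units.
W_xh = [[0.5], [-0.3]]
W_hh = [[0.1, 0.4], [-0.2, 0.3]]
b = [0.0, 0.0]

# Both sequences end with the same input, but their histories differ.
states_a = rnn_forward([[0.0], [0.0], [1.0]], W_xh, W_hh, b)
states_b = rnn_forward([[1.0], [1.0], [1.0]], W_xh, W_hh, b)
print(states_a[-1], states_b[-1])
```

The two final hidden states differ even though the last input is identical, which is the sense in which the hidden state acts as memory. A feedforward layer given only the last input would produce the same output in both cases.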

The vanishing-gradient problem

Plain RNNs are hard to train on long sequences because of the vanishing-gradient problem: as the training signal is propagated back through many steps, it tends to shrink toward zero, so the network struggles to learn dependencies between distant elements.

This limits how far back a simple RNN can effectively remember, which motivated architectures specifically designed to preserve information over longer spans.
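The shrinking signal can be made concrete with a scalar example. The sketch below assumes a one-unit RNN of the form h_t = tanh(w * h_{t-1} + x_t) and tracks the magnitude of the derivative of h_t with respect to the initial state h_0. The weight 0.9, the constant input 0.5, and the function name are illustrative assumptions; with other weights the behaviour can differ.

```python
import math

def gradient_through_time(w, inputs, h0=0.0):
    """Return |d h_t / d h_0| after each step of a scalar tanh RNN."""
    h = h0
    grad = 1.0
    magnitudes = []
    for x in inputs:
        h = math.tanh(w * h + x)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
        magnitudes.append(abs(grad))
    return magnitudes

mags = gradient_through_time(w=0.9, inputs=[0.5] * 50)
print(mags[0], mags[9], mags[49])
```

Each step multiplies the gradient by a factor smaller than one in magnitude, so the product decays roughly geometrically with the number of steps. After 50 steps the influence of the earliest state on the latest one is negligible, which is why distant dependencies are hard to learn.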

LSTM and GRU

To address vanishing gradients, gated variants were developed. The long short-term memory (LSTM) network, introduced by Hochreiter and Schmidhuber in 1997, uses gates to control what information is kept, updated, or discarded, allowing it to retain context over longer sequences. The gated recurrent unit (GRU), introduced by Cho et al. in 2014, is a simpler gated design with similar aims. LSTMs and GRUs became the standard recurrent architectures for sequence tasks until attention-based models emerged.
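As a rough illustration of how gating works, here is a scalar LSTM step in plain Python. It follows the commonly taught formulation with forget, input, and output gates, which is not necessarily identical to the 1997 design; the parameter layout, the gate names used as dictionary keys, and all numeric values are assumptions chosen for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p maps gate name -> (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, bias = p[name]
        return w_x * x + w_h * h_prev + bias

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new content to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # proposed new content

    c = f * c_prev + i * g            # keep some old state, add some new
    h = o * math.tanh(c)
    return h, c

# Hypothetical parameters: forget gate held nearly open, input gate nearly shut.
params = {
    "forget": (0.0, 0.0, 6.0),
    "input": (0.0, 0.0, -6.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 1.0
for _ in range(100):
    h, c = lstm_step(0.0, h, c, params)
print(c)
```

With the forget gate close to one, most of the initial cell state survives 100 steps, because the cell state is updated additively rather than being squashed through a nonlinearity at every step. In a trained network the gate values are learned and depend on the input, so the cell can decide per step what to keep or discard.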

RNNs in research and the shift to transformers

RNNs and their gated variants drove progress in language processing and speech for years. They have since been largely superseded for many tasks by the transformer, which processes sequences in parallel and handles long-range dependencies more effectively. RNNs remain relevant for some streaming and time-series problems and are studied as part of the history of sequence modelling. As with any deep model, reproducible results require reporting the architecture and training details.
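To make the contrast with step-by-step recurrence concrete, the sketch below shows a stripped-down form of the self-attention operation that transformers are built on. It is a simplification under stated assumptions: there are no learned query, key, or value projections, no multiple heads, and no masking, and the function names are hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(seq):
    """Scaled dot-product self-attention over a list of vectors, without learned weights."""
    d = len(seq[0])
    outputs = []
    for q in seq:
        # Score this position against every position in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        # Output is a weighted average of all positions.
        outputs.append([sum(w * v[j] for w, v in zip(weights, seq)) for j in range(d)])
    return outputs

out = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(out)
```

No output here depends on another position's output, only on the inputs, so the iterations of the outer loop could all run at once. In a recurrent network, step t cannot start until step t-1 has produced its hidden state. Every position also looks at every other position directly, rather than through a chain of intermediate states.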

Key facts

At a glance

Definition: neural network with looping (recurrent) connections
Designed for: sequential data (text, speech, time series)
Key feature: a hidden state acting as memory
Main difficulty: the vanishing-gradient problem
Gated variants: LSTM (Hochreiter & Schmidhuber, 1997) and GRU
Status: largely superseded for many tasks by transformers

Common questions

FAQ

What is a recurrent neural network used for?

RNNs process sequential data where order matters — text, speech, and time series. Their looping connections give them a memory of earlier inputs, making them suited to tasks such as language modelling, speech recognition, and forecasting.

What is the vanishing-gradient problem?

It is the tendency of the training signal to shrink toward zero as it is propagated back through many time steps, so a plain RNN struggles to learn relationships between distant elements of a sequence. Gated architectures such as the LSTM were designed to mitigate it.

What is the difference between an RNN and a transformer?

An RNN reads a sequence step by step, carrying a hidden state forward. A transformer processes all positions in parallel using self-attention, which trains more efficiently and handles long-range dependencies better. Transformers have largely replaced RNNs for many tasks.
