
What is machine learning?

Machine learning is the branch of artificial intelligence in which computer systems improve at a task by learning patterns from data, rather than being explicitly programmed with fixed rules for every case.

How machine learning works

A machine-learning system starts with a model that has adjustable parameters and a loss function that measures its error. During training, an optimisation procedure, often gradient descent, repeatedly adjusts the parameters to reduce the loss on the training data. The fitted model is then evaluated on held-out data to estimate generalisation: its performance on examples it has not seen. A central concern is the balance between underfitting (a model too simple to capture the pattern) and overfitting (a model that memorises noise in the training data).
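The training loop described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the toy data (points on y = 2x + 1), the learning rate, and the step count are all assumptions chosen so that the example converges.

```python
# Fit y ≈ w*x + b by gradient descent on mean squared error (the loss).
# Toy data: points on the line y = 2x + 1 (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def mse(w, b):
    """Loss function: mean squared error of the line w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit(learning_rate=0.05, steps=2000):
    """Repeatedly adjust the parameters w and b to reduce the loss."""
    w, b = 0.0, 0.0  # adjustable parameters, starting from an arbitrary guess
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the loss with respect to each parameter.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient, i.e. in the direction that lowers the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = fit()
```

With these settings the fitted `w` and `b` end up very close to 2 and 1. A learning rate that is too large would make the updates diverge instead, which is why it is treated as a tunable hyperparameter.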

Types of machine learning

Supervised learning trains on labelled examples to predict a label — classification for categories, regression for continuous values. Unsupervised learning works with unlabelled data to find structure, such as clusters or a lower-dimensional representation.
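The contrast can be made concrete with a small sketch. The four points, their labels "a" and "b", and the choice of a nearest-neighbour classifier and k-means clustering are illustrative assumptions; many other algorithms fit either paradigm.

```python
# Four 2-D points with labels (hypothetical data for illustration).
labelled = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((5.0, 5.0), "b"),
    ((4.8, 5.2), "b"),
]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def predict(point):
    """Supervised: return the label of the nearest labelled example."""
    nearest = min(labelled, key=lambda item: sq_dist(item[0], point))
    return nearest[1]


def kmeans(points, centroids, iterations=10):
    """Unsupervised: group points around centroids without using labels."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            clusters[idx].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [p for p, _ in labelled]  # the same data with labels discarded
centroids, clusters = kmeans(points, centroids=[points[0], points[2]])
```

Here `predict` needs the labels to answer at all, whereas `kmeans` recovers the two groups from the geometry of the points alone and has no notion of what the groups are called.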

Reinforcement learning trains an agent to choose actions that maximise a cumulative reward through trial and error in an environment. Many practical systems combine these paradigms. The distinction between supervised and unsupervised learning is usually the first one taught.
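Trial-and-error learning from reward can be sketched with a multi-armed bandit, a deliberately simplified reinforcement-learning setting with actions and rewards but no changing state. The epsilon-greedy strategy, the reward probabilities, and the function name `run_bandit` are assumptions made for illustration.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-known action, sometimes explore."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = [0] * len(reward_probs)    # times each action was tried
    values = [0.0] * len(reward_probs)  # running mean reward per action
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(reward_probs))  # explore at random
        else:
            action = max(range(len(values)), key=values.__getitem__)  # exploit
        # The environment pays reward 1 with the chosen action's hidden probability.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental update of the mean reward estimate for this action.
        values[action] += (reward - values[action]) / counts[action]
        total_reward += reward
    return values, counts, total_reward


values, counts, total_reward = run_bandit([0.2, 0.5, 0.8])
```

The agent is never told which action is best. It estimates each action's value from the rewards it receives, and over time it should concentrate its choices on the action with the highest payoff probability.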

Relation to AI and deep learning

Machine learning is one approach to artificial intelligence: it pursues intelligent behaviour by learning from data rather than encoding rules by hand. Deep learning is in turn a subfield of machine learning that uses multi-layer neural networks. The nesting is often summarised as AI ⊃ machine learning ⊃ deep learning, with each inner field a more specific way of achieving the broader goal.

Machine learning in research

In research, machine learning serves both as a method of analysis and as an object of study. Credible results depend on disciplined practice: separating training, validation, and test sets; avoiding data leakage; reporting appropriate metrics; and comparing against baselines. Reproducibility requires documenting data provenance, preprocessing, hyperparameters, and random seeds. Because models can encode biases present in their training data, careful evaluation across subgroups is part of responsible methodology.
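Two of these practices, separating the data into disjoint sets and fixing the random seed, can be sketched as follows. The function name `split_dataset`, the 70/15/15 proportions, and the seed value are illustrative assumptions, not a prescribed protocol.

```python
import random


def split_dataset(items, val_fraction=0.15, test_fraction=0.15, seed=42):
    """Shuffle with a fixed seed, then cut into train, validation and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # same seed gives the same split
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def has_overlap(*subsets):
    """Return True if any example appears in more than one subset (a leakage risk)."""
    seen = set()
    for subset in subsets:
        for item in subset:
            if item in seen:
                return True
            seen.add(item)
    return False


train, val, test = split_dataset(range(100))
leak = has_overlap(train, val, test)
```

Recording the seed alongside the split makes the partition reproducible. A check like `has_overlap` only catches exact duplicates across sets; other forms of leakage, such as preprocessing fitted on the full dataset, need separate care.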

Key facts

At a glance

Field: subfield of artificial intelligence
Core idea: learn patterns from data, not explicit rules
Paradigms: supervised, unsupervised, reinforcement learning
Training: adjust parameters to minimise a loss function
Key risk: overfitting (poor generalisation to new data)
Term coined: Arthur Samuel, 1959

Common questions

FAQ

What are the main types of machine learning?

The three main paradigms are supervised learning (from labelled data), unsupervised learning (finding structure in unlabelled data), and reinforcement learning (learning actions from reward feedback). Many real systems combine elements of these.

What is overfitting?

Overfitting occurs when a model learns noise and idiosyncrasies of the training data rather than the underlying pattern, so it performs well on training examples but poorly on new data. It is detected by comparing performance on the training data with performance on a held-out set: a large gap between the two indicates overfitting.
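A small sketch of that comparison follows. The data are hypothetical: training points follow y = x with alternating noise of +1 and -1, and held-out points lie exactly on y = x. The two models, a nearest-neighbour "memoriser" and a least-squares line, are chosen only to make the contrast visible.

```python
# Training data: y = x plus alternating noise of +1 / -1 (assumed for illustration).
train = [(0.0, 1.0), (2.0, 1.0), (4.0, 5.0), (6.0, 5.0), (8.0, 9.0)]
# Held-out data: unseen x values on the underlying pattern y = x.
held_out = [(1.5, 1.5), (3.5, 3.5), (5.5, 5.5), (7.5, 7.5)]


def memoriser(x):
    """Predict the y of the nearest training x: reproduces the noise exactly."""
    _, nearest_y = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_y


def fit_line(data):
    """Ordinary least-squares straight line through the data."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in data)
             / sum((x - mean_x) ** 2 for x, _ in data))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mse(model, data):
    """Mean squared error of a model on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


line = fit_line(train)
gap_memoriser = mse(memoriser, held_out) - mse(memoriser, train)
gap_line = mse(line, held_out) - mse(line, train)
```

The memoriser has zero training error but a held-out error of about 1.25, which is the signature of overfitting. The line has a higher training error (about 0.96) yet a much lower held-out error (about 0.04), because it captures the trend rather than the noise.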

Is machine learning the same as AI?

No. Machine learning is one approach to artificial intelligence, in which behaviour is learned from data. AI is the broader goal and also includes rule-based and symbolic methods that do not learn from data.
