>_TheQuery

Backpropagation

Fundamentals

An algorithm for computing gradients of the loss function with respect to each weight in a neural network by applying the chain rule layer by layer.

Backpropagation (short for "backward propagation of errors") is the algorithm that makes training deep neural networks computationally feasible. It efficiently computes the gradient of the loss function with respect to every weight in the network by applying the chain rule of calculus from the output layer back through each hidden layer to the input.
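In symbols, the layer-by-layer application of the chain rule can be written as a backward recursion. The notation below is one conventional choice (weighted inputs $z^l = W^l a^{l-1} + b^l$, activations $a^l = \sigma(z^l)$, loss $C$), not notation taken from this entry:

```latex
\delta^L = \nabla_{a^L} C \odot \sigma'(z^L),
\qquad
\delta^l = \bigl((W^{l+1})^{\top} \delta^{l+1}\bigr) \odot \sigma'(z^l),
```

from which the gradients for every layer follow directly:

```latex
\frac{\partial C}{\partial W^l} = \delta^l \,(a^{l-1})^{\top},
\qquad
\frac{\partial C}{\partial b^l} = \delta^l .
```

Each layer's error term $\delta^l$ is computed from the layer above it, which is why a single backward sweep suffices for the whole network.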

During a training step, the forward pass computes the network's prediction, and the loss function measures the error. Backpropagation then works backward: it calculates how much each weight contributed to the error by propagating gradients through the layers. These gradients are then used by an optimization algorithm like gradient descent to update the weights.
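The full training step can be sketched in a few lines of NumPy. This is a minimal illustration for a one-hidden-layer sigmoid network with squared-error loss; the network sizes, input values, and learning rate are arbitrary choices for the example, not anything prescribed by the algorithm itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny network: 3 inputs -> 4 hidden units (sigmoid) -> 1 output (sigmoid)
W1 = rng.normal(size=(4, 3)); b1 = np.zeros(4)
W2 = rng.normal(size=(1, 4)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0, 2.0])  # one training example (made-up values)
y = np.array([1.0])             # its target

# Forward pass: compute the prediction and the squared-error loss
z1 = W1 @ x + b1;  a1 = sigmoid(z1)
z2 = W2 @ a1 + b2; a2 = sigmoid(z2)
loss = 0.5 * float(np.sum((a2 - y) ** 2))

# Backward pass: propagate gradients from the output layer inward
# via the chain rule (sigmoid'(z) = a * (1 - a)).
delta2 = (a2 - y) * a2 * (1 - a2)         # dLoss/dz2
dW2 = np.outer(delta2, a1); db2 = delta2  # layer-2 gradients
delta1 = (W2.T @ delta2) * a1 * (1 - a1)  # dLoss/dz1
dW1 = np.outer(delta1, x);  db1 = delta1  # layer-1 gradients

# Gradient-descent update using the computed gradients
lr = 0.5
W1 -= lr * dW1; b1 -= lr * db1
W2 -= lr * dW2; b2 -= lr * db2
```

Note that the backward pass reuses the activations `a1` and `a2` saved during the forward pass; caching these intermediate values is what makes backpropagation efficient.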

Backpropagation was popularized in the 1986 paper by Rumelhart, Hinton, and Williams, and it remains the primary method for training neural networks today. Modern deep learning frameworks like PyTorch and TensorFlow implement automatic differentiation, which generalizes backpropagation and handles gradient computation transparently for arbitrary computational graphs.
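The idea behind automatic differentiation can be sketched with a toy scalar implementation: each operation records its inputs and local derivatives, and `backward` walks the recorded graph applying the chain rule. This is a simplified analogue of what frameworks like PyTorch do internally, not their actual API; the `Var` class and its methods are inventions for this example:

```python
class Var:
    """A scalar node in a computational graph that records, for each
    operation, its parent nodes and the local derivative toward each."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        self._parents = parents  # pairs of (parent_node, local_gradient)

    def __add__(self, other):
        # d(a+b)/da = 1, d(a+b)/db = 1
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        # d(a*b)/da = b, d(a*b)/db = a
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, seed=1.0):
        # Accumulate d(output)/d(self), then push the gradient to each
        # parent scaled by the local derivative (the chain rule).
        self.grad += seed
        for parent, local in self._parents:
            parent.backward(seed * local)

x = Var(3.0)
y = Var(4.0)
z = x * y + x    # builds the graph for z = x*y + x
z.backward()     # dz/dx = y + 1 = 5.0, dz/dy = x = 3.0
```

Because `x` feeds into the graph twice, its gradient is the sum of the contributions from both paths, which the recursive accumulation handles automatically. Real frameworks do the same thing over tensors, with a topological ordering instead of naive recursion.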

Last updated: February 20, 2026