Question 1

What is a norm in linear algebra?

Accepted Answer

A norm is a function that assigns a non-negative length or size to a vector. It must satisfy three properties: (1) non-negativity (the norm is zero only for the zero vector), (2) absolute homogeneity (scaling a vector by a scalar a scales its norm by |a|, so ||a x|| = |a| ||x||), and (3) the triangle inequality (the norm of a sum is at most the sum of the norms). Different norms measure size differently, leading to different notions of distance and geometry.
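As a quick sanity check, the three properties can be verified numerically for one concrete norm. The sketch below assumes the Euclidean norm and a pair of arbitrarily chosen example vectors; the helper name `l2_norm` is illustrative only, and a numerical check on a few vectors is not a proof.

```python
import math

def l2_norm(x):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(v * v for v in x))

# Example vectors and scalar chosen arbitrarily for illustration.
x = [3.0, -4.0]
y = [1.0, 2.0]
a = -2.5

norm_x = l2_norm(x)
norm_zero = l2_norm([0.0, 0.0])

# (1) Non-negativity: the norm is zero for the zero vector, positive otherwise.
non_negativity_holds = norm_zero == 0.0 and norm_x > 0.0

# (2) Absolute homogeneity: ||a x|| equals |a| * ||x||.
scaled = l2_norm([a * v for v in x])
homogeneity_holds = math.isclose(scaled, abs(a) * norm_x)

# (3) Triangle inequality: ||x + y|| <= ||x|| + ||y||.
summed = l2_norm([u + v for u, v in zip(x, y)])
triangle_holds = summed <= norm_x + l2_norm(y)

print(non_negativity_holds, homogeneity_holds, triangle_holds)
```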

Question 2

What is the difference between L1 and L2 norm?

Accepted Answer

The L1 norm sums absolute values: ||x||_1 = |x_1| + |x_2| + ... + |x_n|. The L2 norm takes the square root of the sum of squared values: ||x||_2 = sqrt(x_1^2 + x_2^2 + ... + x_n^2). L1 corresponds to Manhattan distance (walking along a grid), L2 to Euclidean distance (a straight line). In two dimensions, the L1 unit ball is a diamond and the L2 unit ball is a circular disk.
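To make the difference concrete, here is a minimal sketch that measures the same displacement with both norms. The function names and the two example points are assumptions made for illustration.

```python
import math

def l1_norm(x):
    """Sum of absolute values (Manhattan length)."""
    return sum(abs(v) for v in x)

def l2_norm(x):
    """Square root of the sum of squares (Euclidean length)."""
    return math.sqrt(sum(v * v for v in x))

# Two example points on a grid; the distance between them is the
# norm of their difference.
start = (0.0, 0.0)
end = (3.0, 4.0)
diff = [e - s for s, e in zip(start, end)]

manhattan = l1_norm(diff)   # 3 blocks east plus 4 blocks north
euclidean = l2_norm(diff)   # the straight-line distance

print(manhattan, euclidean)  # 7.0 5.0
```

The L1 distance is never smaller than the L2 distance, since the grid path cannot be shorter than the straight line.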

Question 3

What is the Lp norm?

Accepted Answer

The Lp norm generalizes L1 and L2: ||x||_p = (|x_1|^p + |x_2|^p + ... + |x_n|^p)^(1/p) for p >= 1. At p=1 you get L1 (diamond unit ball), at p=2 you get L2 (circular unit ball), and as p approaches infinity you get the L-infinity norm (square unit ball, max of absolute values). The unit ball shape smoothly morphs between these extremes.
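The formula translates almost directly into code. The sketch below is one possible implementation, assuming plain Python lists as vectors; `lp_norm` is a hypothetical helper name, and the version shown makes no attempt to guard against floating-point overflow for very large p or very large components.

```python
def lp_norm(x, p):
    """Lp norm of a list of floats for p >= 1, or p = float('inf')."""
    if p == float("inf"):
        # Limit case: the largest absolute value among the components.
        return max(abs(v) for v in x)
    if p < 1:
        raise ValueError("p must be at least 1 for the formula to define a norm")
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

x = [3.0, -4.0]

# As p grows, the value decreases toward the L-infinity norm of x.
values = [lp_norm(x, p) for p in (1, 2, 4, 16, 64)]
limit = lp_norm(x, float("inf"))

for p, value in zip((1, 2, 4, 16, 64), values):
    print(p, value)
print("inf", limit)
```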

Question 4

Why does the L1 norm encourage sparsity?

Accepted Answer

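The L1 unit ball is a diamond (cross-polytope) with sharp corners that sit on the coordinate axes. When you constrain an optimization to stay within the L1 ball, solutions tend to land at these corners, or on the edges and lower-dimensional faces between them, where one or more coordinates are exactly zero. The L2 ball, by contrast, is round and has no corners, so its solutions generally have every coordinate nonzero. This geometric property is why L1 regularization (Lasso) produces sparse models.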
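A small experiment illustrates the corner effect. The sketch below deliberately uses a simplified setting, not Lasso itself: it maximizes a linear objective c . x over each unit ball, where the maximizers are known in closed form. The function names and the example vector `c` are assumptions for illustration, and `c` is assumed to be nonzero.

```python
import math

def argmax_linear_on_l1_ball(c):
    """One maximizer of c . x subject to ||x||_1 <= 1.

    All weight goes on the coordinate with the largest |c_i|, which is a
    corner of the diamond. If several coordinates tie, this picks the first.
    """
    k = max(range(len(c)), key=lambda i: abs(c[i]))
    x = [0.0] * len(c)
    x[k] = math.copysign(1.0, c[k])
    return x

def argmax_linear_on_l2_ball(c):
    """The maximizer of c . x subject to ||x||_2 <= 1: c scaled to unit length."""
    length = math.sqrt(sum(v * v for v in c))
    return [v / length for v in c]

# Example objective direction, chosen arbitrarily.
c = [0.5, -2.0, 1.0]

x_l1 = argmax_linear_on_l1_ball(c)  # lands on a corner: one nonzero entry
x_l2 = argmax_linear_on_l2_ball(c)  # every entry stays nonzero

print(x_l1)
print(x_l2)
```

With a quadratic loss, as in Lasso, the solution is not always a corner, but the same geometry makes points with zero coordinates far more likely than under an L2 constraint.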

p = 1	$\lVert \mathbf{x} \rVert_1 = |x_1| + |x_2| + \cdots + |x_n|$. Sum of absolute values.
p = 2	$\lVert \mathbf{x} \rVert_2 = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$. The familiar Euclidean length.
p = $\infty$	$\lVert \mathbf{x} \rVert_\infty = \max(|x_1|, |x_2|, \ldots, |x_n|)$. The largest absolute value among the components.

Norms: Measuring Size and Distance

The Lp norm

Manhattan vs Euclidean distance

The unit ball

Notation used in this series
