

Norms and Vector Spaces



Vector Spaces

A vector space is a set \(V\) of vectors and a field \(F\) (elements of \(F\) are called scalars) with the following two operations:

  1. Vector addition: \(\forall \mathbf{v},\mathbf{w} \in V\), \(\mathbf{v} + \mathbf{w} \in V\)
  2. Scalar multiplication: \(\forall \alpha \in F, \mathbf{v} \in V\), \(\alpha \mathbf{v} \in V\)

which satisfy the following conditions:

  1. Associativity (vector): \(\forall \mathbf{u}, \mathbf{v}, \mathbf{w} \in V\), \((\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v}+\mathbf{w})\)
  2. Zero vector: There exists a vector \(\mathbf{0} \in V\) such that \(\forall \mathbf{u} \in V, \mathbf{0} + \mathbf{u} = \mathbf{u}\)
  3. Additive inverse (negatives): For every \(\mathbf{u} \in V\), there exists \(-\mathbf{u} \in V\) such that \(\mathbf{u} + (-\mathbf{u}) = \mathbf{0}\).
  4. Associativity (scalar): \(\forall \alpha, \beta \in F, \mathbf{u} \in V\), \((\alpha \beta) \mathbf{u} = \alpha (\beta \mathbf{u})\)
  5. Distributivity: \(\forall \alpha, \beta \in F, \mathbf{u} \in V\), \((\alpha + \beta) \mathbf{u} = \alpha \mathbf{u} + \beta \mathbf{u}\)
  6. Unitarity: \(\forall \mathbf{u} \in V\), \(1 \mathbf{u} = \mathbf{u}\)

One example of a vector space is \(V=\mathbb{R}^2\) with \(F=\mathbb{R}\).
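
The axioms above can be spot-checked numerically. The following is a minimal NumPy sketch (vectors and scalars chosen arbitrarily) verifying a few of the conditions for \(V=\mathbb{R}^2\), \(F=\mathbb{R}\):

```python
import numpy as np

# Arbitrary vectors in R^2 and scalars in R, just for illustration
u, v, w = np.array([1.0, 2.0]), np.array([-3.0, 0.5]), np.array([4.0, -1.0])
alpha, beta = 2.0, -0.5

print(np.allclose((u + v) + w, u + (v + w)))                  # associativity (vector)
print(np.allclose(np.zeros(2) + u, u))                        # zero vector
print(np.allclose(u + (-u), np.zeros(2)))                     # additive inverse
print(np.allclose((alpha * beta) * u, alpha * (beta * u)))    # associativity (scalar)
print(np.allclose((alpha + beta) * u, alpha * u + beta * u))  # distributivity
print(np.allclose(1 * u, u))                                  # unitarity
```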

Inner Product

Let \(V\) be a real vector space. Then, an inner product is a function \(\langle\cdot, \cdot \rangle: V \times V \rightarrow \mathbb{R}\) (i.e., it takes two vectors and returns a real number) which satisfies the following four properties, where \(\mathbf{u}, \mathbf{v}, \mathbf{w} \in V\) and \(\alpha, \beta \in \mathbb{R}\):

  1. Positivity: \(\langle \mathbf{u}, \mathbf{u} \rangle \geq 0\)
  2. Definiteness: \(\langle \mathbf{u}, \mathbf{u} \rangle = 0\) if and only if \(\mathbf{u} = \mathbf{0}\)
  3. Symmetry: \(\langle \mathbf{u}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{u} \rangle\)
  4. Linearity: \(\langle \alpha \mathbf{u} + \beta \mathbf{v}, \mathbf{w} \rangle = \alpha \langle \mathbf{u}, \mathbf{w} \rangle + \beta \langle \mathbf{v}, \mathbf{w} \rangle\)

The inner product intuitively represents the similarity between two vectors. Two vectors \(\mathbf{u}, \mathbf{v} \in V\) are said to be orthogonal if \(\langle \mathbf{u}, \mathbf{v} \rangle = 0\).
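
For \(V = \mathbb{R}^n\), the standard dot product \(\langle \mathbf{u}, \mathbf{v} \rangle = \sum_{i=1}^n u_i v_i\) is an inner product. A minimal NumPy sketch (with arbitrarily chosen vectors and scalars) checking symmetry, linearity, and orthogonality:

```python
import numpy as np

u = np.array([1.0, 2.0, -1.0])
v = np.array([0.0, 3.0, 2.0])
w = np.array([-2.0, 1.0, 4.0])
alpha, beta = 1.5, -2.0

print(np.isclose(np.dot(u, v), np.dot(v, u)))                   # symmetry
print(np.isclose(np.dot(alpha * u + beta * v, w),
                 alpha * np.dot(u, w) + beta * np.dot(v, w)))   # linearity
print(np.dot(np.array([1.0, 0.0]), np.array([0.0, 1.0])))       # orthogonal vectors: 0.0
```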

Vector Norm

A vector norm is a function \(\|\cdot\|: V \rightarrow \mathbb{R}^+_0\) (i.e., it takes a vector and returns a nonnegative real number) that satisfies the following properties, where \(\mathbf{u}, \mathbf{v} \in V\) and \(\alpha \in \mathbb{R}\):

  1. Positivity: \(\| \mathbf{u}\| \geq 0\)
  2. Definiteness: \(\|\mathbf{u}\| = 0\) if and only if \(\mathbf{u} = \mathbf{0}\)
  3. Homogeneity: \(\|\alpha \mathbf{u}\| = |\alpha| \|\mathbf{u}\|\)
  4. Triangle inequality: \(\|\mathbf{u} + \mathbf{v}\| \leq \|\mathbf{u}\| + \|\mathbf{v}\|\)

A norm is a generalization of "absolute value" and measures the "magnitude" of the input vector.

The p-norm

The p-norm is defined as \(\|\mathbf{w}\|_p = \left(\sum_{i=1}^N |w_i|^p\right)^{\frac{1}{p}}\), where \(N\) is the length of \(\mathbf{w}\). The definition is a valid norm when \(p \geq 1\). If \(0 < p < 1\), it is not a valid norm because it violates the triangle inequality.

When \(p=2\) (2-norm), this is called the Euclidean norm and it corresponds to the length of the vector.
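
The definition translates directly into code. The following is a minimal sketch (the helper name p_norm is just for illustration); for \(p \geq 1\) it agrees with numpy.linalg.norm:

```python
import numpy as np

def p_norm(w, p):
    """p-norm of a vector, following the definition above (a valid norm for p >= 1)."""
    return np.sum(np.abs(w) ** p) ** (1.0 / p)

w = np.array([-3.0, 5.0, 0.0, 1.0])
print(p_norm(w, 1), np.linalg.norm(w, 1))  # 1-norm
print(p_norm(w, 2), np.linalg.norm(w, 2))  # 2-norm (Euclidean length)
print(p_norm(w, 3), np.linalg.norm(w, 3))  # 3-norm
```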

Vector Norm Examples

Consider the case of \(\mathbf{w} = [-3, 5, 0, 1]\). In this part we will show how to calculate the 1-, 2-, and \(\infty\)-norms of \(\mathbf{w}\).

For the 1-norm:

\(\|\mathbf{w}\|_1 = (\sum_{i=1}^N |w_i|^1)^{\frac{1}{1}}\)

\(\|\mathbf{w}\|_1 = \sum_{i=1}^N |w_i|\)

\(\|\mathbf{w}\|_1 = |-3| + |5| + |0| + |1|\)

\(\|\mathbf{w}\|_1 = 3 + 5 + 0 + 1\)

\(\|\mathbf{w}\|_1 = 9\)

For the 2-norm:

\(\|\mathbf{w}\|_2 = (\sum_{i=1}^N |w_i|^2)^{\frac{1}{2}}\)

\(\|\mathbf{w}\|_2 = \sqrt{\sum_{i=1}^N w_i^2}\)

\(\|\mathbf{w}\|_2 = \sqrt{(-3)^2 + (5)^2 + (0)^2 + (1)^2}\)

\(\|\mathbf{w}\|_2 = \sqrt{9 + 25 + 0 + 1}\)

\(\|\mathbf{w}\|_2 = \sqrt{35} \approx 5.92\)

For the \(\infty\)-norm:

\(\|\mathbf{w}\|_\infty = \lim_{p\to\infty}(\sum_{i=1}^N |w_i|^p)^{\frac{1}{p}}\)

\(\|\mathbf{w}\|_\infty = \max_{i=1,\dots,N} |w_i|\)

\(\|\mathbf{w}\|_\infty = \max(|-3|, |5|, |0|, |1|)\)

\(\|\mathbf{w}\|_\infty = \max(3, 5, 0, 1)\)

\(\|\mathbf{w}\|_\infty = 5\)
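
These hand computations can be checked with numpy.linalg.norm (a minimal sketch):

```python
import numpy as np

w = np.array([-3.0, 5.0, 0.0, 1.0])
print(np.linalg.norm(w, 1))       # 9.0
print(np.linalg.norm(w, 2))       # sqrt(35) ≈ 5.92
print(np.linalg.norm(w, np.inf))  # 5.0
```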

Matrix Norm

A general matrix norm is a real-valued function \(\| A \|\) that satisfies the following properties:

  1. \(\|A\| \geq 0\)
  2. \(\|A\| = 0\) if and only if \(A = 0\)
  3. \(\|\lambda A\| = |\lambda| \|A\|\) for all scalars \(\lambda\)
  4. \(\|A + B\| \leq \|A\| + \|B\|\) (triangle inequality)

Induced (or operator) matrix norms are associated with a specific vector norm \(\| \cdot \|\) and are defined as:

\[\|A\| := \max_{\|\mathbf{x}\|=1} \|A\mathbf{x}\|.\]

An induced matrix norm is a particular type of a general matrix norm. Induced matrix norms tell us the maximum amplification of the norm of any vector when multiplied by the matrix. Note that the definition above is equivalent to

\[\|A\| = \max_{\mathbf{x} \neq \mathbf{0}} \frac{\| A \mathbf{x}\|}{\|\mathbf{x}\|}.\]

In addition to the properties above of general matrix norms, induced matrix norms also satisfy the submultiplicative conditions:

  1. \(\| A \mathbf{x} \| \leq \|A\| \|\mathbf{x}\|\)
  2. \(\|A B\| \leq \|A\| \|B\|\)
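
A quick numerical illustration of these submultiplicative properties, using the induced 2-norm and arbitrary random matrices (a sketch, not a proof):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
x = rng.standard_normal(3)

# ||Ax|| <= ||A|| ||x||
print(np.linalg.norm(A @ x, 2) <= np.linalg.norm(A, 2) * np.linalg.norm(x, 2))
# ||AB|| <= ||A|| ||B||
print(np.linalg.norm(A @ B, 2) <= np.linalg.norm(A, 2) * np.linalg.norm(B, 2))
```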

Frobenius norm

The Frobenius norm is the square root of the sum of the squares of every entry of the matrix, which is equivalent to applying the vector \(2\)-norm to the flattened matrix, \[\|A\|_F = \sqrt{\sum_{i,j} a_{ij}^2}.\]

The Frobenius norm is an example of a general matrix norm that is not an induced norm.
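
A minimal sketch computing the Frobenius norm both from the definition and with NumPy's built-in:

```python
import numpy as np

A = np.array([[3.0, -2.0],
              [-1.0, 3.0]])

print(np.sqrt(np.sum(A ** 2)))   # from the definition: sqrt(9 + 4 + 1 + 9) = sqrt(23)
print(np.linalg.norm(A, 'fro'))  # NumPy's Frobenius norm, same value
```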

The matrix p-norm

The matrix p-norm is induced by the p-norm of a vector. It is \(\|A\|_p := \max_{\|\mathbf{x}\|_p=1} \|A\mathbf{x}\|_p\). There are three special cases. For the 1-norm, this reduces to the maximum absolute column sum of the matrix. For the 2-norm, this reduces to the maximum singular value of the matrix. For the \(\infty\)-norm, this reduces to the maximum absolute row sum of the matrix.
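
A small sketch of these three characterizations, compared against numpy.linalg.norm (the matrix below is chosen arbitrarily):

```python
import numpy as np

A = np.array([[1.0, -7.0,  2.0],
              [4.0,  0.0, -3.0]])

col_sum = np.max(np.sum(np.abs(A), axis=0))           # 1-norm: max absolute column sum
sigma   = np.max(np.linalg.svd(A, compute_uv=False))  # 2-norm: max singular value
row_sum = np.max(np.sum(np.abs(A), axis=1))           # inf-norm: max absolute row sum

print(col_sum, np.linalg.norm(A, 1))
print(sigma,   np.linalg.norm(A, 2))
print(row_sum, np.linalg.norm(A, np.inf))
```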

Matrix Norm Examples

Now we will go through a few examples with a matrix \(C\), defined below.

\[ C = \begin{bmatrix} 3 & -2 \\ -1 & 3 \\ \end{bmatrix} \]

For the 1-norm:

\(\|C\|_1 = \max_{\|\mathbf{x}\|_1=1} \|C\mathbf{x}\|_1\)

\(\|C\|_1 = \max_{1 \leq j \leq 2} \sum_{i=1}^2|C_{ij}|\)

\(\|C\|_1 = \max(|3| + |-1|, |-2| + |3|)\)

\(\|C\|_1 = \max(4, 5)\)

\(\|C\|_1 = 5\)

For the 2-norm:

The singular values are the square roots of the eigenvalues of the matrix \(C^T C\). You can also find the maximum singular value by calculating the SVD of the matrix.

\(\|C\|_2 = \max_{\|\mathbf{x}\|_2=1} \|C\mathbf{x}\|_2\)

\(\det(C^T C - \lambda I) = 0\)

\[ \det\left( \begin{bmatrix} 3 & -1 \\ -2 & 3 \\ \end{bmatrix} \begin{bmatrix} 3 & -2 \\ -1 & 3 \\ \end{bmatrix} - \lambda I\right) = 0 \]

\[ \det\left( \begin{bmatrix} 9+1 & -6-3 \\ -3-6 & 4+9 \\ \end{bmatrix} - \lambda I\right) = 0 \]

\[ \det\left( \begin{bmatrix} 10 - \lambda & -9 \\ -9 & 13 - \lambda \\ \end{bmatrix} \right) = 0 \]

\((10-\lambda)(13-\lambda) - 81 = 0\)

\(\lambda^2 - 23\lambda + 130 - 81 = 0\)

\(\lambda^2 - 23\lambda + 49 = 0\)

\((\lambda-\frac{1}{2}(23+3\sqrt{37}))(\lambda-\frac{1}{2}(23-3\sqrt{37})) = 0\)

\(\|C\|_2 = \sqrt{\lambda_{max}} = \sqrt{\frac{1}{2}(23+3\sqrt{37})} \approx 4.54\)

For the \(\infty\)-norm:

\(\|C\|_{\infty} = \max_{\|\mathbf{x}\|_{\infty}=1} \|C\mathbf{x}\|_{\infty}\)

\(\|C\|_{\infty} = \max_{1 \leq i \leq 2} \sum_{j=1}^2|C_{ij}|\)

\(\|C\|_{\infty} = \max(|3| + |-2|, |-1| + |3|)\)

\(\|C\|_{\infty} = \max(5, 4)\)

\(\|C\|_{\infty} = 5\)
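
These values for \(C\) can be verified with NumPy (a minimal sketch):

```python
import numpy as np

C = np.array([[3.0, -2.0],
              [-1.0, 3.0]])

print(np.linalg.norm(C, 1))         # 5.0
print(np.linalg.norm(C, 2))         # sqrt((23 + 3*sqrt(37)) / 2) ≈ 4.54
print(np.linalg.norm(C, np.inf))    # 5.0
print(np.linalg.eigvalsh(C.T @ C))  # eigenvalues of C^T C, max ≈ 20.62
```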

Review Questions

  1. What is a vector space?
  2. What is an inner product?
  3. Given a specific function \(f(\mathbf{x}, \mathbf{y})\), can \(f(\mathbf{x}, \mathbf{y})\) be considered an inner product?
  4. What is a vector norm? (What properties must hold for a function to be a vector norm?)
  5. Given a specific function \(f(\mathbf{x})\), can \(f(\mathbf{x})\) be considered a norm?
  6. What is the definition of an induced matrix norm? What do they measure?
  7. What properties do induced matrix norms satisfy? Which ones are the submultiplicative properties? Be able to apply all of these properties.
  8. For an induced matrix norm, given \(\|\mathbf{x}\|\) and \(\|A\mathbf{x}\|\) for a few vectors, can you determine a lower bound on \(\|A\|\)?
  9. What is the Frobenius matrix norm?
  10. For a given vector, compute the 1, 2 and \(\infty\) norm of the vector.
  11. For a given matrix, compute the 1, 2 and \(\infty\) norm of the matrix.
  12. Know what the norms of special matrices are (e.g., the norm of a diagonal matrix, an orthogonal matrix, etc.).
