Singular Value Decompositions

Learning Objectives

Construct an SVD of a matrix
Identify pieces of an SVD
Use an SVD to solve a problem

Singular Value Decomposition

An $m \times n$ real matrix $\mathbf{A}$ has a singular value decomposition of the form

$$\mathbf{A} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T$$

where

$\mathbf{U}$ is an $m \times m$ orthogonal matrix whose columns are eigenvectors of $\mathbf{A}\mathbf{A}^T$. The columns of $\mathbf{U}$ are called the left singular vectors of $\mathbf{A}$.
$\mathbf{\Sigma}$ is an $m \times n$ diagonal matrix of the form:

$$(\mathbf{\Sigma})_{ij} = \begin{cases} \sigma_i & i = j \\ 0 & i \neq j \end{cases}$$

where $s = \min(m, n)$ and $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_s \geq 0$ are the square roots of the $s$ largest eigenvalues of $\mathbf{A}^T\mathbf{A}$. The diagonal entries are called the singular values of $\mathbf{A}$.

Note that $\mathbf{A}^T\mathbf{A}$ and $\mathbf{A}\mathbf{A}^T$ have the same non-zero eigenvalues. Suppose $\mathbf{A}\mathbf{A}^T$ has an eigenvalue $\lambda$ with eigenvector $\mathbf{x}$, and $\mathbf{A}^T\mathbf{x} \neq 0$:

$$\mathbf{A}\mathbf{A}^T\mathbf{x} = \lambda\mathbf{x}$$

Left-multiplying both sides by $\mathbf{A}^T$ gives

$$\mathbf{A}^T\mathbf{A}\mathbf{A}^T\mathbf{x} = \mathbf{A}^T\lambda\mathbf{x}$$

$$\mathbf{A}^T\mathbf{A}(\mathbf{A}^T\mathbf{x}) = \lambda(\mathbf{A}^T\mathbf{x})$$

so $\mathbf{A}^T\mathbf{x}$ is an eigenvector of $\mathbf{A}^T\mathbf{A}$ with the same eigenvalue $\lambda$.

$\mathbf{V}$ is an $n \times n$ orthogonal matrix whose columns are eigenvectors of $\mathbf{A}^T\mathbf{A}$. The columns of $\mathbf{V}$ are called the right singular vectors of $\mathbf{A}$.
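As a quick numerical check of the definition, the sketch below uses NumPy's `numpy.linalg.svd` on a small made-up matrix (the entries of `A` are an assumption for illustration, not taken from the text). It checks that $\mathbf{U}$ and $\mathbf{V}$ are orthogonal, that the singular values are the square roots of the eigenvalues of $\mathbf{A}^T\mathbf{A}$, and that $\mathbf{A}\mathbf{A}^T$ shares those eigenvalues.

```python
import numpy as np

# Hypothetical 4 x 3 example matrix (m = 4, n = 3), chosen only for illustration.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [2.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
m, n = A.shape
s = min(m, n)

# full_matrices=True returns U (m x m), the s singular values, and V^T (n x n).
U, sigma, Vt = np.linalg.svd(A, full_matrices=True)

# Place the singular values on the diagonal of an m x n matrix Sigma.
Sigma = np.zeros((m, n))
Sigma[:s, :s] = np.diag(sigma)

# U and V are orthogonal, and U Sigma V^T reproduces A.
U_is_orthogonal = np.allclose(U.T @ U, np.eye(m))
V_is_orthogonal = np.allclose(Vt @ Vt.T, np.eye(n))
reconstructs_A = np.allclose(U @ Sigma @ Vt, A)

# eigvalsh handles symmetric matrices and returns ascending eigenvalues,
# so reverse them to match the descending order of the singular values.
eig_AtA = np.linalg.eigvalsh(A.T @ A)[::-1]
matches_eigs = np.allclose(sigma, np.sqrt(np.clip(eig_AtA, 0.0, None)))

# A A^T is m x m; its n largest eigenvalues equal those of A^T A.
eig_AAt = np.linalg.eigvalsh(A @ A.T)[::-1]
shared_eigs = np.allclose(eig_AAt[:n], eig_AtA)

print(U_is_orthogonal, V_is_orthogonal, reconstructs_A, matches_eigs, shared_eigs)
```

Note that NumPy returns $\mathbf{V}^T$ rather than $\mathbf{V}$, and the singular values as a 1-D array rather than as the matrix $\mathbf{\Sigma}$.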

Time Complexity

The time-complexity for computing the SVD factorization of an arbitrary $m \times n$ matrix is proportional to $m^2 n + n^3$, where the constant of proportionality ranges from 4 to 10 (or more) depending on the algorithm.

In general, we can define the cost as:

$$\mathcal{O}(m^2 n + n^3)$$
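To get a feel for how this cost scales, here is a small sketch. The helper `svd_cost_estimate` and its `constant` parameter are hypothetical names introduced for illustration; the function only evaluates the expression $c\,(m^2 n + n^3)$ and does not measure any real algorithm.

```python
def svd_cost_estimate(m: int, n: int, constant: float = 4.0) -> float:
    """Rough operation count constant * (m^2 n + n^3) for an m x n SVD."""
    return constant * (m**2 * n + n**3)


# Range of the estimate for a 1000 x 100 matrix, using the constants 4 and 10.
low = svd_cost_estimate(1000, 100, constant=4.0)
high = svd_cost_estimate(1000, 100, constant=10.0)

# Doubling both dimensions multiplies every term by 2^3 = 8.
ratio = svd_cost_estimate(2000, 200) / svd_cost_estimate(1000, 100)

print(low, high, ratio)
```

Since both terms are cubic in the dimensions, doubling $m$ and $n$ together makes the estimate eight times larger, whichever constant applies.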

Reduced SVD

The SVD factorization of a non-square matrix $\mathbf{A}$ of size $m \times n$ can be represented in a reduced format:

For $m \geq n$: $\mathbf{U}$ is $m \times n$, $\mathbf{\Sigma}$ is $n \times n$, and $\mathbf{V}$ is $n \times n$
For $m \leq n$: $\mathbf{U}$ is $m \times m$, $\mathbf{\Sigma}$ is $m \times m$, and $\mathbf{V}$ is $n \times m$ (note if $\mathbf{V}$ is $n \times m$, then $\mathbf{V}^T$ is $m \times n$)

The following figure depicts the reduced SVD factorization (in red) against the full SVD factorizations (in gray).

In general, we will represent the reduced SVD as:

$$\mathbf{A} = \mathbf{U}_R \mathbf{\Sigma}_R \mathbf{V}_R^T$$

where $\mathbf{U}_R$ is an $m \times s$ matrix, $\mathbf{V}_R$ is an $n \times s$ matrix, $\mathbf{\Sigma}_R$ is an $s \times s$ matrix, and $s = \min(m, n)$.
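The shapes of the reduced factors can be checked numerically. The sketch below assumes NumPy and uses random matrices (the seed and sizes are arbitrary choices for illustration); passing `full_matrices=False` to `numpy.linalg.svd` requests the reduced factorization.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

shapes = {}
for m, n in [(5, 3), (3, 5)]:
    A = rng.standard_normal((m, n))  # hypothetical test matrix
    # Reduced SVD: U_R is m x s, Vt_R (that is, V_R^T) is s x n, s = min(m, n).
    U_R, sigma, Vt_R = np.linalg.svd(A, full_matrices=False)
    Sigma_R = np.diag(sigma)  # s x s diagonal matrix
    # The reduced factors still reproduce A.
    assert np.allclose(U_R @ Sigma_R @ Vt_R, A)
    shapes[(m, n)] = (U_R.shape, Sigma_R.shape, Vt_R.shape)

for size, factor_shapes in shapes.items():
    print(size, factor_shapes)
```

Because NumPy returns $\mathbf{V}_R^T$, the reported shape $s \times n$ corresponds to $\mathbf{V}_R$ being $n \times s$.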

Example: Computing the SVD

We begin with the following non-square matrix, $\mathbf{A}$

$$\mathbf{A} = \begin{bmatrix} 3 & 2 & 3 \\ 8 & 8 & 2 \\ 8 & 7 & 4 \\ 1 & 8 & 7 \\ 6 & 4 & 7 \end{bmatrix}$$

and we will compute the reduced form of the SVD (here $s = 3$):

(1) Compute $\mathbf{A}^T\mathbf{A}$:

$$\mathbf{A}^T\mathbf{A} = \begin{bmatrix} 174 & 158 & 106 \\ 158 & 197 & 134 \\ 106 & 134 & 127 \end{bmatrix}$$

(2) Compute the eigenvectors and eigenvalues of $\mathbf{A}^T\mathbf{A}$:

$$\lambda_1 = 437.479, \quad \lambda_2 = 42.6444, \quad \lambda_3 = 17.8766$$

$$\mathbf{v}_1 = \begin{bmatrix} 0.585051 \\ 0.652648 \\ 0.481418 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} -0.710399 \\ 0.126068 \\ 0.692415 \end{bmatrix}, \quad \mathbf{v}_3 = \begin{bmatrix} 0.391212 \\ -0.747098 \\ 0.537398 \end{bmatrix}$$

(3) Construct $\mathbf{V}_R$ from the eigenvectors of $\mathbf{A}^T\mathbf{A}$:

$$\mathbf{V}_R = \begin{bmatrix} 0.585051 & -0.710399 & 0.391212 \\ 0.652648 & 0.126068 & -0.747098 \\ 0.481418 & 0.692415 & 0.537398 \end{bmatrix}$$

(4) Construct $\mathbf{\Sigma}_R$ from the square roots of the eigenvalues of $\mathbf{A}^T\mathbf{A}$:

$$\mathbf{\Sigma}_R = \begin{bmatrix} 20.916 & 0 & 0 \\ 0 & 6.53027 & 0 \\ 0 & 0 & 4.22807 \end{bmatrix}$$

(5) Find $\mathbf{U}$ by solving $\mathbf{U}\mathbf{\Sigma} = \mathbf{A}\mathbf{V}$. For our reduced case, all singular values are non-zero, so $\mathbf{\Sigma}_R$ is invertible and we can find $\mathbf{U}_R = \mathbf{A}\mathbf{V}_R\mathbf{\Sigma}_R^{-1}$. You could also find $\mathbf{U}$ by computing the eigenvectors of $\mathbf{A}\mathbf{A}^T$.

$$\mathbf{U}_R = \begin{bmatrix} 0.215371 & 0.030348 & 0.305490 \\ 0.519432 & -0.503779 & -0.419173 \\ 0.534262 & -0.311021 & 0.011730 \\ 0.438715 & 0.787878 & -0.431352 \\ 0.453759 & 0.166729 & 0.738082 \end{bmatrix}$$

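We obtain the following singular value decomposition for $\mathbf{A}$:

$$\mathbf{A} = \begin{bmatrix} 0.215371 & 0.030348 & 0.305490 \\ 0.519432 & -0.503779 & -0.419173 \\ 0.534262 & -0.311021 & 0.011730 \\ 0.438715 & 0.787878 & -0.431352 \\ 0.453759 & 0.166729 & 0.738082 \end{bmatrix} \begin{bmatrix} 20.916 & 0 & 0 \\ 0 & 6.53027 & 0 \\ 0 & 0 & 4.22807 \end{bmatrix} \begin{bmatrix} 0.585051 & 0.652648 & 0.481418 \\ -0.710399 & 0.126068 & 0.692415 \\ 0.391212 & -0.747098 & 0.537398 \end{bmatrix}$$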

Recall that we computed the reduced SVD factorization (i.e. $\mathbf{\Sigma}$ is square, $\mathbf{U}$ is non-square) here.
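Steps (1) through (5) can be sketched in NumPy as follows. This is an illustration of the eigenvalue-based procedure in this example, not how production SVD routines work; the variable names are choices made here. Eigenvectors are only determined up to sign, so individual columns of `V_R` and `U_R` may come out negated relative to the values printed in the text while still giving a valid SVD.

```python
import numpy as np

A = np.array([[3, 2, 3],
              [8, 8, 2],
              [8, 7, 4],
              [1, 8, 7],
              [6, 4, 7]], dtype=float)

# (1) Form the symmetric matrix A^T A.
AtA = A.T @ A

# (2) eigh is intended for symmetric matrices and returns eigenvalues in
#     ascending order, so reorder them to be descending.
eigvals, eigvecs = np.linalg.eigh(AtA)
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]

# (3) The columns of V_R are the eigenvectors in the same order.
V_R = eigvecs[:, order]

# (4) The singular values are the square roots of the eigenvalues.
sigma = np.sqrt(eigvals)
Sigma_R = np.diag(sigma)

# (5) U_R = A V_R Sigma_R^{-1}. Dividing column j by sigma[j] has the same
#     effect as multiplying by the inverse (requires every sigma[j] > 0).
U_R = (A @ V_R) / sigma

# Checks: the factors reproduce A and agree with NumPy's own SVD routine.
reconstruction_error = np.linalg.norm(U_R @ Sigma_R @ V_R.T - A)
sigma_ref = np.linalg.svd(A, compute_uv=False)

print(np.round(sigma, 5))
print(reconstruction_error)
```

Forming $\mathbf{A}^T\mathbf{A}$ explicitly squares the condition number of the problem, which is one reason library routines such as `numpy.linalg.svd` do not compute the SVD this way.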

Rank, null space and range of a matrix

Suppose $\mathbf{A}$ is an $m \times n$ matrix where $m > n$ (without loss of generality):

$$\mathbf{A} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^T = \begin{bmatrix} | & & | \\ \mathbf{u}_1 & \dots & \mathbf{u}_m \\ | & & | \end{bmatrix} \begin{bmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \\ & \mathbf{0} & \end{bmatrix} \begin{bmatrix} - & \mathbf{v}_1^T & - \\ & \vdots & \\ - & \mathbf{v}_n^T & - \end{bmatrix}$$

We can re-write the above as:

$$\mathbf{A} = \begin{bmatrix} | & & | \\ | & & | \\ \mathbf{u}_1 & \dots & \mathbf{u}_n \\ | & & | \\ | & & | \end{bmatrix} \begin{bmatrix} - & \sigma_1 \mathbf{v}_1^T & - \\ & \vdots & \\ - & \sigma_n \mathbf{v}_n^T & - \end{bmatrix}$$

Furthermore, the product of two matrices can be written as a sum of outer products:

$$\mathbf{A} = \sigma_1 \mathbf{u}_1 \mathbf{v}_1^T + \sigma_2 \mathbf{u}_2 \mathbf{v}_2^T + \dots + \sigma_n \mathbf{u}_n \mathbf{v}_n^T$$

For a general rectangular matrix, we have:

$$\mathbf{A} = \sum_{i=1}^{s} \sigma_i \mathbf{u}_i \mathbf{v}_i^T$$
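The outer-product form can be verified directly. The following sketch assumes NumPy and reuses the $5 \times 3$ matrix from the worked example; it accumulates the rank-one terms $\sigma_i \mathbf{u}_i \mathbf{v}_i^T$ and compares the sum with $\mathbf{A}$.

```python
import numpy as np

A = np.array([[3, 2, 3],
              [8, 8, 2],
              [8, 7, 4],
              [1, 8, 7],
              [6, 4, 7]], dtype=float)

U_R, sigma, Vt_R = np.linalg.svd(A, full_matrices=False)

A_sum = np.zeros_like(A)
for i in range(sigma.size):
    # Row i of Vt_R is v_i^T, so np.outer builds the m x n matrix u_i v_i^T.
    A_sum += sigma[i] * np.outer(U_R[:, i], Vt_R[i])

max_abs_error = np.max(np.abs(A_sum - A))
print(max_abs_error)
```

Each term in the sum is a rank-one matrix, and the terms are weighted by singular values in decreasing order.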

If $\mathbf{A}$ has $s$ non-zero singular values, the matrix is full rank, i.e. $\text{rank}(\mathbf{A}) = s$.

If $\mathbf{A}$ has $r$ non-zero singular values, and $r < s$, the matrix is rank deficient, i.e. $\text{rank}(\mathbf{A}) = r$.

In other words, the rank of $\mathbf{A}$ equals the number of non-zero singular values which is the same as the number of non-zero diagonal elements in $\mathbf{\Sigma}$.

Rounding errors may lead to small but non-zero singular values in a rank deficient matrix. Singular values that are smaller than a given tolerance are assumed to be numerically equivalent to zero, defining what is sometimes called the effective rank.
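A minimal sketch of an effective-rank computation, assuming NumPy: the helper `effective_rank` and the matrix `B` are hypothetical, introduced only for illustration. When no tolerance is given, the sketch uses $\sigma_{\max} \cdot \max(m, n) \cdot \varepsilon$, which is the default convention documented for `numpy.linalg.matrix_rank`; other tolerances are equally legitimate.

```python
import numpy as np


def effective_rank(A, tol=None):
    """Count the singular values of A that exceed a tolerance."""
    A = np.asarray(A, dtype=float)
    sigma = np.linalg.svd(A, compute_uv=False)
    if tol is None:
        # Assumed default: largest singular value * max(m, n) * machine epsilon.
        tol = sigma.max() * max(A.shape) * np.finfo(A.dtype).eps
    return int(np.sum(sigma > tol))


# Hypothetical rank-deficient matrix: the third column is the sum of the
# first two, so the exact rank is 2.
B = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 9.0],
              [7.0, 8.0, 15.0],
              [2.0, 0.0, 2.0]])

print(np.linalg.svd(B, compute_uv=False))  # smallest value is tiny, not exactly 0
print(effective_rank(B), effective_rank(B, tol=1e-10))
```

In floating point the smallest singular value of `B` is typically a tiny non-zero number, which is why the comparison is made against a tolerance and not against exact zero.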

The right-singular vectors (columns of $\mathbf{V}$) corresponding to vanishing singular values of $\mathbf{A}$ span the null space of $\mathbf{A}$, i.e. $\text{null}(\mathbf{A}) = \text{span}\{\mathbf{v}_{r+1}, \mathbf{v}_{r+2}, \dots, \mathbf{v}_n\}$.

The left-singular vectors (columns of $\mathbf{U}$) corresponding to the non-zero singular values of $\mathbf{A}$ span the range of $\mathbf{A}$, i.e. $\text{range}(\mathbf{A}) = \text{span}\{\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_r\}$.
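The two statements above can be turned into a short sketch, assuming NumPy. The matrix `B` and the tolerance `tol` are hypothetical choices for illustration: `B` has a third column equal to the sum of the first two, so its null space is spanned by a multiple of $(1, 1, -1)$.

```python
import numpy as np

# Hypothetical rank-2 matrix: column 3 = column 1 + column 2.
B = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 9.0],
              [7.0, 8.0, 15.0],
              [2.0, 0.0, 2.0]])

U, sigma, Vt = np.linalg.svd(B, full_matrices=True)

tol = 1e-10                      # assumed threshold for "numerically zero"
r = int(np.sum(sigma > tol))     # effective rank

# Rows r, ..., n-1 of V^T are v_{r+1}^T, ..., v_n^T: a basis of null(B).
null_basis = Vt[r:].T
# Columns 0, ..., r-1 of U are u_1, ..., u_r: a basis of range(B).
range_basis = U[:, :r]

# B maps every null-space basis vector to (numerically) zero.
null_residual = np.max(np.abs(B @ null_basis))
# Projecting the columns of B onto span{u_1, ..., u_r} leaves them unchanged.
range_residual = np.max(np.abs(range_basis @ (range_basis.T @ B) - B))

print(r, null_basis.ravel(), null_residual, range_residual)
```

Both bases are orthonormal because they are taken from the columns of the orthogonal matrices $\mathbf{V}$ and $\mathbf{U}$.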

Example:

$$\mathbf{A} = \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0 \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 14 & 0 & 0 \\ 0 & 14 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

The rank of $A <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow data-mjx-texclass="ORD"><mi mathvariant="bold">A</mi></mrow></math>$ is 2.

The vectors $[1√21√200]<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow data-mjx-texclass="INNER"><mo data-mjx-texclass="OPEN">[</mo><mtable columnspacing="1em" rowspacing="4pt"><mtr><mtd><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mtd></mtr><mtr><mtd><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mtd></mtr><mtr><mtd><mn>0</mn></mtd></mtr><mtr><mtd><mn>0</mn></mtd></mtr></mtable><mo data-mjx-texclass="CLOSE">]</mo></mrow></math>$ and $[−1√21√200]<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow data-mjx-texclass="INNER"><mo data-mjx-texclass="OPEN">[</mo><mtable columnspacing="1em" rowspacing="4pt"><mtr><mtd><mo>−</mo><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mtd></mtr><mtr><mtd><mfrac><mn>1</mn><msqrt><mn>2</mn></msqrt></mfrac></mtd></mtr><mtr><mtd><mn>0</mn></mtd></mtr><mtr><mtd><mn>0</mn></mtd></mtr></mtable><mo data-mjx-texclass="CLOSE">]</mo></mrow></math>$ provide an orthonormal basis for the range of $A <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow data-mjx-texclass="ORD"><mi mathvariant="bold">A</mi></mrow></math>$ .

The vector $[\begin{array}{c} 0 \\ 0 \\ 1 \end{array}]$ provides an orthonormal basis for the null space of $A$ .
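The rank, range, and null space read off above can be checked numerically. The following is a minimal NumPy sketch, assuming the factors of the example and an illustrative relative tolerance `tol` for deciding which singular values count as zero (the names `range_basis` and `null_basis` are chosen here for illustration):

```python
import numpy as np

# Factors copied from the example above.
r = 1.0 / np.sqrt(2.0)
U = np.array([[r,  -r,  0.0, 0.0],
              [r,   r,  0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
Sigma = np.zeros((4, 3))
Sigma[0, 0] = Sigma[1, 1] = 14.0
Vt = np.eye(3)
A = U @ Sigma @ Vt

# Recompute the SVD numerically; full_matrices=True returns the 4x4 U and 3x3 V^T.
U_num, s, Vt_num = np.linalg.svd(A, full_matrices=True)

# Assumed threshold: singular values below this are treated as zero.
tol = max(A.shape) * np.finfo(float).eps * s[0]
rank = int(np.sum(s > tol))

range_basis = U_num[:, :rank]    # first `rank` left singular vectors
null_basis = Vt_num[rank:, :].T  # remaining right singular vectors

print("rank:", rank)
print("A @ null_basis:")
print(A @ null_basis)
```

Because the two non-zero singular values are equal, the computed left singular vectors may differ from the ones listed above by a rotation within the same plane; they still span the same range.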

(Moore-Penrose) Pseudoinverse

If the matrix $Σ$ is rank deficient or not square, it has no inverse. We instead define its pseudoinverse $Σ^{+}$ , an $n \times m$ diagonal matrix with entries:

(Σ^{+})_{ii} = \begin{cases} \frac{1}{σ_{i}} & σ_{i} \neq 0 \\ 0 & σ_{i} = 0 \end{cases}

For a general non-square matrix $A$ with known SVD ( $A = U Σ V^{T}$ ), the pseudoinverse is defined as:

A^{+} = V Σ^{+} U^{T}

For example, if we consider an $m \times n$ full rank matrix where $m > n$ , all $n$ singular values are non-zero and $Σ^{+}$ is the $n \times m$ matrix:

Σ^{+} = [\begin{array}{cccccc} \frac{1}{σ_{1}} & & & 0 & \dots & 0 \\ & \ddots & & \vdots & & \vdots \\ & & \frac{1}{σ_{n}} & 0 & \dots & 0 \end{array}] .
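A minimal NumPy sketch of this construction is given here, first for a tall full rank matrix and then for a rank deficient one. The helper name `pinv_via_svd`, the cutoff `rtol`, and the small test matrices are assumptions made for illustration; they are not part of the notes.

```python
import numpy as np

def pinv_via_svd(A, rtol=1e-12):
    """Pseudoinverse A+ = V Sigma+ U^T; `rtol` is an assumed cutoff for 'zero'."""
    m, n = A.shape
    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    tol = rtol * s[0]  # s is sorted in descending order
    Sigma_plus = np.zeros((n, m))  # note the transposed shape
    for i, sigma in enumerate(s):
        if sigma > tol:
            Sigma_plus[i, i] = 1.0 / sigma  # invert only non-zero singular values
    return Vt.T @ Sigma_plus @ U.T

# Tall full rank matrix (m > n): A+ A is the n x n identity.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
A_plus = pinv_via_svd(A)
print(A_plus @ A)

# Rank deficient square matrix: no inverse exists, but A+ is still defined.
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])
B_plus = pinv_via_svd(B)
print(B_plus)
```

In practice `np.linalg.pinv` performs the same kind of computation with its own default cutoff, so the hand-written loop is only meant to mirror the definition.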

Euclidean norm of matrices

The induced 2-norm of a matrix $A$ can be obtained using the SVD of the matrix:

\begin{aligned} ‖ A ‖_{2} &= max_{‖ x ‖ = 1} ‖ A x ‖ = max_{‖ x ‖ = 1} ‖ U Σ V^{T} x ‖ \\ &= max_{‖ x ‖ = 1} ‖ Σ V^{T} x ‖ = max_{‖ V^{T} x ‖ = 1} ‖ Σ V^{T} x ‖ = max_{‖ y ‖ = 1} ‖ Σ y ‖ \end{aligned}

And hence,

‖ A ‖_{2} = σ_{1}

In the above equations, every norm $‖ . ‖$ is the $p = 2$ Euclidean norm, and we used the fact that $U$ and $V$ are orthogonal matrices, so multiplying a vector by them does not change its 2-norm and $‖ U ‖_{2} = ‖ V ‖_{2} = 1$ .

Example:

We begin with the following non-square matrix $A$ :

A = [\begin{array}{ccc} 3 & 2 & 3 \\ 8 & 8 & 2 \\ 8 & 7 & 4 \\ 1 & 8 & 7 \\ 6 & 4 & 7 \end{array}] .

The matrix of singular values, $Σ$ , computed from the reduced SVD factorization (which keeps only the first $n = 3$ rows of the full $5 \times 3$ matrix) is:

Σ = [\begin{array}{ccc} 20.916 & 0 & 0 \\ 0 & 6.53027 & 0 \\ 0 & 0 & 4.22807 \end{array}] .

Consequently the 2-norm of $A$ is

‖ A ‖_{2} = 20.916 .
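The example can be reproduced with a short NumPy sketch. It compares the largest singular value with NumPy's built-in matrix 2-norm and, as an informal sanity check, with the largest value of $‖ A x ‖$ over a sample of random unit vectors (the sample size and seed are arbitrary choices made here):

```python
import numpy as np

A = np.array([[3, 2, 3],
              [8, 8, 2],
              [8, 7, 4],
              [1, 8, 7],
              [6, 4, 7]], dtype=float)

# Singular values only, sorted in descending order.
s = np.linalg.svd(A, compute_uv=False)
norm_from_svd = s[0]
norm_builtin = np.linalg.norm(A, 2)  # induced 2-norm

# ||A x|| over random unit vectors never exceeds sigma_1.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 1000))
X /= np.linalg.norm(X, axis=0)  # normalize each column
sampled_max = np.max(np.linalg.norm(A @ X, axis=0))

print("singular values:", s)
print("sigma_1:", norm_from_svd, "np.linalg.norm(A, 2):", norm_builtin)
print("largest sampled ||A x||:", sampled_max)
```

Random sampling only gives a lower bound on the norm; the maximum itself is attained at the first right singular vector.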

Euclidean norm of the inverse of matrices

Following the same derivation as above, we can show that for a full rank $n \times n$ matrix we have:

‖ A^{- 1} ‖_{2} = \frac{1}{σ_{n}}

where $σ_{n}$ is the smallest singular value.

For non-square matrices, we can use the definition of the pseudoinverse (regardless of the rank):

‖ A^{+} ‖_{2} = \frac{1}{σ_{r}}

where $σ_{r}$ is the smallest non-zero singular value. Note that for a full rank square matrix, we have $‖ A^{+} ‖_{2} = ‖ A^{- 1} ‖_{2}$ . The one exception to the definition above is the zero matrix, which has no non-zero singular values; in this case, $‖ A^{+} ‖_{2} = 0$ .
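These three cases can be checked numerically. The sketch below uses a small square matrix `M` chosen here for illustration, the $5 \times 3$ matrix from the earlier example, and a zero matrix, and relies on `np.linalg.inv` and `np.linalg.pinv` for the inverse and pseudoinverse:

```python
import numpy as np

# Full rank square matrix: ||M^{-1}||_2 = 1 / sigma_n.
M = np.array([[4.0, 1.0],
              [2.0, 3.0]])
s_M = np.linalg.svd(M, compute_uv=False)
inv_norm = np.linalg.norm(np.linalg.inv(M), 2)
print(inv_norm, 1.0 / s_M[-1])

# Non-square full rank matrix: ||A^+||_2 = 1 / (smallest non-zero singular value).
A = np.array([[3, 2, 3],
              [8, 8, 2],
              [8, 7, 4],
              [1, 8, 7],
              [6, 4, 7]], dtype=float)
s_A = np.linalg.svd(A, compute_uv=False)
pinv_norm = np.linalg.norm(np.linalg.pinv(A), 2)
print(pinv_norm, 1.0 / s_A[-1])

# Zero matrix: the pseudoinverse is again zero, so its norm is 0.
Z = np.zeros((3, 2))
zero_pinv_norm = np.linalg.norm(np.linalg.pinv(Z), 2)
print(zero_pinv_norm)
```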

2-Norm Condition Number

The 2-norm condition number of a matrix $A$ is given by the ratio of its largest singular value to its smallest singular value:

{cond}_{2} (A) = ‖ A ‖_{2} ‖ A^{- 1} ‖_{2} = σ_{max} / σ_{min} .

If the matrix $A$ is rank deficient, i.e. $rank (A) < min (m, n)$ , then ${cond}_{2} (A) = \infty$ .
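As a rough illustration, the condition number can be computed directly from the singular values. The helper `cond2` and its cutoff `rtol` are assumptions made for this sketch; in floating point a rank deficient matrix usually has a tiny rather than exactly zero smallest singular value, which is why a tolerance is used to report infinity.

```python
import numpy as np

def cond2(A, rtol=1e-12):
    """2-norm condition number sigma_max / sigma_min; inf if A is rank deficient."""
    s = np.linalg.svd(A, compute_uv=False)  # min(m, n) values, descending
    if s[0] == 0.0 or s[-1] <= rtol * s[0]:
        return np.inf
    return s[0] / s[-1]

M = np.array([[4.0, 1.0],
              [2.0, 3.0]])
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])  # rank 1, so rank deficient

print(cond2(M), np.linalg.cond(M, 2))
print(cond2(B))
```

Without the tolerance, the ratio for `B` would be either infinite or a very large finite number depending on rounding, which is how `np.linalg.cond` behaves.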

Low-rank Approximation

The best rank- $k$ approximation for an $m \times n$ matrix $A$ , where $k < s = min (m, n)$ , for some matrix norm $‖ . ‖$ , is one that solves the following minimization problem:

\begin{aligned} & min_{A_{k}} ‖ A - A_{k} ‖ \\ & \text{such that } rank (A_{k}) \leq k . \end{aligned}

Under the induced $2$ -norm, the best rank- $k$ approximation is given by the sum of the first $k$ outer products of the left and right singular vectors, each scaled by the corresponding singular value (where $σ_{1} \geq \dots \geq σ_{s}$ ):

A_{k} = σ_{1} u_{1} v_{1}^{T} + \dots + σ_{k} u_{k} v_{k}^{T}

Observe that, under the induced $2$ -norm, the norm of the difference between the matrix and its best rank- $k$ approximation is the $(k + 1)^{th}$ singular value of the matrix:

‖ A - A_{k} ‖_{2} = ‖ \sum_{i = k + 1}^{s} σ_{i} u_{i} v_{i}^{T} ‖_{2} = σ_{k + 1}

Note that the best rank- $k$ approximation to $A$ can be stored efficiently by only storing the $k$ singular values $σ_{1}, \dots, σ_{k}$ , the $k$ left singular vectors $u_{1}, \dots, u_{k}$ , and the $k$ right singular vectors $v_{1}, \dots, v_{k}$ .
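The truncated sum, its error, and the storage saving can be sketched in NumPy as follows. The helper `best_rank_k`, the random $50 \times 40$ test matrix, and the choice $k = 5$ are illustrative assumptions, not taken from the notes:

```python
import numpy as np

def best_rank_k(A, k):
    """Return the rank-k truncated SVD sum and the full list of singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Scale the first k left singular vectors by sigma_i, then multiply by V_k^T.
    A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return A_k, s

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 40))
k = 5

A_k, s = best_rank_k(A, k)
err = np.linalg.norm(A - A_k, 2)

m, n = A.shape
storage_full = m * n                # every entry of A
storage_low_rank = k * (m + n + 1)  # k singular values plus k pairs of vectors

print("error:", err, "sigma_{k+1}:", s[k])  # s[k] is sigma_{k+1} (0-based indexing)
print("storage:", storage_low_rank, "vs", storage_full)
```

The storage saving only appears when $k$ is small relative to $m$ and $n$ ; for a very small matrix, storing the factors can take more numbers than storing the matrix itself.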

The figure below shows best rank- $k$ approximations of an image (you can find the code snippet that generates these images in the IPython notebook):

Review Questions

See this review link

ChangeLog

2020-04-26 Mariana Silva mfsilva@illinois.edu: adding more details to sections
2018-11-14 Erin Carrier ecarrie2@illinois.edu: spelling fix
2018-10-18 Erin Carrier ecarrie2@illinois.edu: correct svd cost
2018-01-14 Erin Carrier ecarrie2@illinois.edu: removes demo links
2017-12-04 Arun Lakshmanan lakshma2@illinois.edu: fix best rank approx, svd image
2017-11-15 Erin Carrier ecarrie2@illinois.edu: adds review questions, adds cond num sec, removes normal equations, minor corrections and clarifications
2017-11-13 Arun Lakshmanan lakshma2@illinois.edu: first complete draft
2017-10-17 Luke Olson lukeo@illinois.edu: outline