Singular Value Decompositions
Learning Objectives
- Construct an SVD of a matrix
- Identify pieces of an SVD
- Use an SVD to solve a problem
Singular Value Decomposition
An m \times n real matrix A has a singular value decomposition of the form A = U \Sigma V^\top, where
- U is an m \times m orthogonal matrix whose columns are eigenvectors of AA^\top. The columns of U are called the left singular vectors of A.
- \Sigma is an m \times n diagonal matrix of the form: \Sigma = \left[ \begin{array}{ccc} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \\ 0 & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & 0 \\ \end{array} \right], where \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge 0 are the square roots of the eigenvalues of A^\top A. The diagonal entries are called the singular values of A.
- V is an n \times n orthogonal matrix whose columns are eigenvectors of A^\top A. The columns of V are called the right singular vectors of A.
Note: The time complexity of computing the SVD factorization of an arbitrary m \times n matrix is O(m^2 n + n^3).
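For instance, the definition can be checked with a minimal NumPy sketch (the small matrix below is an arbitrary full-rank example):

import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.]])  # arbitrary example matrix

# Full SVD: U is m x m, s holds the singular values, Vt is V^T.
U, s, Vt = np.linalg.svd(A)

m, n = A.shape
# U and V are orthogonal.
assert np.allclose(U.T @ U, np.eye(m))
assert np.allclose(Vt @ Vt.T, np.eye(n))

# The singular values are the square roots of the eigenvalues of A^T A,
# in descending order (eigvalsh returns ascending order, so reverse).
assert np.allclose(s, np.sqrt(np.linalg.eigvalsh(A.T @ A)[::-1]))

# Reassembling U Sigma V^T recovers A.
Sigma = np.zeros((m, n))
np.fill_diagonal(Sigma, s)
assert np.allclose(U @ Sigma @ Vt, A)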
Example: Computing the SVD
We begin with the following non-square matrix, A: A = \left[ \begin{array}{ccc} 3 & 2 & 3 \\ 8 & 8 & 2 \\ 8 & 7 & 4 \\ 1 & 8 & 7 \\ 6 & 4 & 7 \\ \end{array} \right].
Compute A^\top A: A^\top A = \left[ \begin{array}{ccc} 174 & 158 & 106 \\ 158 & 197 & 134 \\ 106 & 134 & 127 \\ \end{array} \right].
Compute the eigenvectors and eigenvalues of A^\top A: \lambda_1 = 437.479, \quad \lambda_2 = 42.6444, \quad \lambda_3 = 17.8766, \qquad \boldsymbol{v}_1 = \left[ \begin{array}{c} 0.585051 \\ 0.652648 \\ 0.481418 \\ \end{array} \right], \quad \boldsymbol{v}_2 = \left[ \begin{array}{c} -0.710399 \\ 0.126068 \\ 0.692415 \\ \end{array} \right], \quad \boldsymbol{v}_3 = \left[ \begin{array}{c} 0.391212 \\ -0.747098 \\ 0.537398 \\ \end{array} \right].
Construct V from the eigenvectors of A^\top A: V = \left[ \begin{array}{ccc} 0.585051 & -0.710399 & 0.391212 \\ 0.652648 & 0.126068 & -0.747098 \\ 0.481418 & 0.692415 & 0.537398 \\ \end{array} \right].
Construct \Sigma from the square roots of the eigenvalues of A^\top A: \Sigma = \left[ \begin{array}{ccc} 20.916 & 0 & 0 \\ 0 & 6.53207 & 0 \\ 0 & 0 & 4.22807 \\ \end{array} \right].
Find U by solving U\Sigma = AV. For our simple case, we can find U = AV\Sigma^{-1}. You could also find U by computing the eigenvectors of AA^\top. U = \overbrace{\left[ \begin{array}{ccc} 3 & 2 & 3 \\ 8 & 8 & 2 \\ 8 & 7 & 4 \\ 1 & 8 & 7 \\ 6 & 4 & 7 \\ \end{array} \right]}^{A} \overbrace{\left[ \begin{array}{ccc} 0.585051 & -0.710399 & 0.391212 \\ 0.652648 & 0.126068 & -0.747098 \\ 0.481418 & 0.692415 & 0.537398 \\ \end{array} \right]}^{V} \overbrace{\left[ \begin{array}{ccc} 0.047810 & 0.0 & 0.0 \\ 0.0 & 0.153133 & 0.0 \\ 0.0 & 0.0 & 0.236515 \\ \end{array} \right]}^{\Sigma^{-1}} = \left[ \begin{array}{ccc} 0.215371 & 0.030348 & 0.305490 \\ 0.519432 & -0.503779 & -0.419173 \\ 0.534262 & -0.311021 & 0.011730 \\ 0.438715 & 0.787878 & -0.431352 \\ 0.453759 & 0.166729 & 0.738082 \\ \end{array} \right]
We obtain the following singular value decomposition for A: \overbrace{\left[ \begin{array}{ccc} 3 & 2 & 3 \\ 8 & 8 & 2 \\ 8 & 7 & 4 \\ 1 & 8 & 7 \\ 6 & 4 & 7 \\ \end{array} \right]}^{A} = \overbrace{\left[ \begin{array}{ccc} 0.215371 & 0.030348 & 0.305490 \\ 0.519432 & -0.503779 & -0.419173 \\ 0.534262 & -0.311021 & 0.011730 \\ 0.438715 & 0.787878 & -0.431352 \\ 0.453759 & 0.166729 & 0.738082 \\ \end{array} \right]}^{U} \overbrace{\left[ \begin{array}{ccc} 20.916 & 0 & 0 \\ 0 & 6.53207 & 0 \\ 0 & 0 & 4.22807 \\ \end{array} \right]}^{\Sigma} \overbrace{\left[ \begin{array}{ccc} 0.585051 & 0.652648 & 0.481418 \\ -0.710399 & 0.126068 & 0.692415 \\ 0.391212 & -0.747098 & 0.537398 \\ \end{array} \right]}^{V^\top}
Note: We computed the reduced SVD factorization here (i.e., \Sigma is square, U is non-square).
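As a sanity check, the computation above can be reproduced with a short NumPy sketch (the signs of the computed eigenvectors may differ from the hand computation, since eigenvectors are only determined up to sign):

import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.],
              [1., 8., 7.],
              [6., 4., 7.]])

# Eigendecomposition of A^T A; eigh returns eigenvalues in ascending
# order, so reverse to get lambda_1 >= lambda_2 >= lambda_3.
lam, V = np.linalg.eigh(A.T @ A)
lam, V = lam[::-1], V[:, ::-1]

Sigma = np.diag(np.sqrt(lam))           # diag(20.916, 6.53207, 4.22807)
U = A @ V @ np.linalg.inv(Sigma)        # solve U Sigma = A V

assert np.allclose(U @ Sigma @ V.T, A)  # the reduced SVD reproduces A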
SVD: Solving the Least Squares Problem
Given the linear system of equations A\boldsymbol{x} \cong \boldsymbol{b}, where A is a non-square matrix whose SVD factorization (A = U\Sigma V^\top) is known, we can compute the solution that minimizes the sum of squared residuals as: \boldsymbol{x} = V\Sigma^{+}U^\top \boldsymbol{b}, where \Sigma^{+} is the pseudoinverse of the singular value matrix, computed by taking the reciprocal of each non-zero diagonal entry, leaving the zeros in place, and transposing the resulting matrix. For example: \Sigma = \left[ \begin{array}{ccc} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \\ 0 & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & 0 \\ \end{array} \right] \implies \Sigma^{+} = \left[ \begin{array}{cccccc} \frac{1}{\sigma_1} & & & 0 & \dots & 0 \\ & \ddots & & \vdots & \ddots & \vdots \\ & & \frac{1}{\sigma_n} & 0 & \dots & 0 \\ \end{array} \right].
Note: Solving the least squares problem has a time complexity of O(\max(m, n)^2) for a known full SVD factorization, and O(mn) for a known reduced SVD factorization.
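A minimal NumPy sketch of this solve, using the example matrix from above and a hypothetical right-hand side b:

import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.],
              [1., 8., 7.],
              [6., 4., 7.]])
b = np.array([1., 2., 3., 4., 5.])      # hypothetical right-hand side

# Reduced SVD: Sigma is square here and all singular values are
# nonzero, so applying Sigma^+ is an elementwise division by s.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
x = Vt.T @ ((U.T @ b) / s)              # x = V Sigma^+ U^T b

# Agrees with the library least-squares solver.
assert np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0])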
(Moore-Penrose) Pseudoinverse
Given a non-square matrix A whose SVD factorization (A = U\Sigma V^\top) is known, the pseudoinverse of A is defined as: A^{+} = V\Sigma^{+}U^\top
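As a quick check, this definition can be compared against NumPy's built-in pseudoinverse (a sketch using the example matrix from above, all of whose singular values are nonzero):

import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.],
              [1., 8., 7.],
              [6., 4., 7.]])

# With no zero singular values, Sigma^+ is just the reciprocal of the
# diagonal of Sigma from the reduced SVD.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T  # A^+ = V Sigma^+ U^T

assert np.allclose(A_pinv, np.linalg.pinv(A))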
Reduced SVD
The SVD factorization of a non-square matrix A of size m \times n can be represented in a compact fashion:
- For m \ge n: U is m \times n, \Sigma is n \times n, and V is n \times n
- For m \le n: U is m \times m, \Sigma is m \times m, and V is n \times m (note if V is n \times m, then V^\top is m \times n)
The following figure depicts the reduced SVD factorization (in red) against the full SVD factorizations (in gray).
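In NumPy, the full_matrices flag of np.linalg.svd selects between the two factorizations; a quick shape check (sketch):

import numpy as np

A = np.random.rand(5, 3)                # m = 5, n = 3, so m >= n

# Full SVD: U is m x m, V is n x n.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
print(U.shape, s.shape, Vt.shape)       # (5, 5) (3,) (3, 3)

# Reduced SVD: U is m x n, Sigma is n x n, V is n x n.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(U.shape, s.shape, Vt.shape)       # (5, 3) (3,) (3, 3)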
2-Norm based on Singular Values
The induced 2-norm of a matrix A is given by its largest singular value. Writing A = U\Sigma V^\top and using the fact that multiplication by an orthogonal matrix preserves the 2-norm, we have \|A\boldsymbol{x}\|_2 = \|\Sigma V^\top \boldsymbol{x}\|_2, so with \boldsymbol{y} = V^\top \boldsymbol{x}: \|A\|_2 = \max_{\|\boldsymbol{y}\|_2 = 1} \|\Sigma \boldsymbol{y}\|_2 = \sigma_1, where \Sigma is the diagonal matrix of singular values of A.
Example: 2-Norm based on Singular Values
We begin with the following non-square matrix, A: A = \left[ \begin{array}{ccc} 3 & 2 & 3 \\ 8 & 8 & 2 \\ 8 & 7 & 4 \\ 1 & 8 & 7 \\ 6 & 4 & 7 \\ \end{array} \right].
The matrix of singular values, \Sigma, computed from the SVD factorization is: \Sigma = \left[ \begin{array}{ccc} 20.916 & 0 & 0 \\ 0 & 6.53207 & 0 \\ 0 & 0 & 4.22807 \\ \end{array} \right].
Following our definition, we have the 2-norm of A as \|A\|_2 = 20.916.
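This can be verified numerically (a minimal sketch):

import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.],
              [1., 8., 7.],
              [6., 4., 7.]])

s = np.linalg.svd(A, compute_uv=False)  # singular values only
print(s[0])                             # approximately 20.916

# The induced 2-norm is the largest singular value.
assert np.isclose(np.linalg.norm(A, 2), s[0])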
Computing 2-Norm Condition Number from Singular Values
The 2-norm condition number of a matrix A is given by the ratio of its largest singular value to its smallest singular value: \text{cond}_2(A) = \|A\|_2 \|A^{-1}\|_2 = \sigma_{\max}/\sigma_{\min}. If \text{rank}(A) < \min(m,n), then \sigma_{\min} = 0 so \text{cond}_2(A) = \infty.
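A minimal sketch of the same computation in NumPy, again with the example matrix from above:

import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.],
              [1., 8., 7.],
              [6., 4., 7.]])

s = np.linalg.svd(A, compute_uv=False)
cond2 = s[0] / s[-1]                    # sigma_max / sigma_min

# Agrees with NumPy's built-in 2-norm condition number.
assert np.isclose(cond2, np.linalg.cond(A, 2))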
Low-rank Approximation
The best rank-k approximation of an m \times n matrix A (where k \le m and k \le n) for some matrix norm \| \cdot \| is the matrix A_k that solves the following optimization problem: \begin{aligned} &\min_{A_k} \ \|A - A_k\| \\ &\textrm{such that} \quad \mathrm{rank}(A_k) \le k. \end{aligned}
Under the induced 2-norm, the best rank-k approximation is given by the sum of the first k outer products of the left and right singular vectors, scaled by the corresponding singular values (where \sigma_1 \ge \dots \ge \sigma_n): A_k = \sigma_1 \boldsymbol{u}_1 \boldsymbol{v}_1^\top + \dots + \sigma_k \boldsymbol{u}_k \boldsymbol{v}_k^\top
Observe that the norm of the difference between the best approximation and the matrix under the induced 2-norm condition is the magnitude of the (k+1)^\text{th} singular value of the matrix: \|A - A_k\|_2 = \left|\left|\sum_{i=k+1}^n \sigma_i \boldsymbol{u}_i \boldsymbol{v}_i^\top\right|\right|_2 = \sigma_{k+1}
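A quick numerical check of this identity, using the example matrix from above and k = 1 (a sketch):

import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.],
              [1., 8., 7.],
              [6., 4., 7.]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 1
# Best rank-k approximation: sum of the first k scaled outer products.
A_k = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(k))

# The 2-norm error is exactly the (k+1)-th singular value (about 6.53207).
assert np.isclose(np.linalg.norm(A - A_k, 2), s[k])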
Example: Low-rank Approximation
The following code snippet performs best rank-k approximation (under 2-norm assumptions) on an image; the image is fetched over HTTP and converted to a grayscale floating-point array with Pillow (scipy.misc.imread has been removed from SciPy):
import io
from urllib.request import urlopen

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

def bestk(A, k):
    # Reduced SVD; keep only the k largest singular values and vectors.
    U, S, V = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(S[:k]) @ V[:k, :]

url = 'https://upload.wikimedia.org/wikipedia/en/thumb/a/a8/Foellinger_Auditorium.jpg/640px-Foellinger_Auditorium.jpg'
with urlopen(url) as response:
    # Decode the JPEG, convert to grayscale ('L'), and cast to float.
    foellinger = np.asarray(Image.open(io.BytesIO(response.read())).convert('L'),
                            dtype=np.float64)

low_10 = bestk(foellinger, 10)
low_20 = bestk(foellinger, 20)
low_50 = bestk(foellinger, 50)

plt.subplot(2, 2, 1)
plt.imshow(low_10, cmap='gray')
plt.title("k = 10")
plt.subplot(2, 2, 2)
plt.imshow(low_20, cmap='gray')
plt.title("k = 20")
plt.subplot(2, 2, 3)
plt.imshow(low_50, cmap='gray')
plt.title("k = 50")
plt.subplot(2, 2, 4)
plt.imshow(foellinger, cmap='gray')
plt.title("k = 480")  # the original 640x480 image has rank at most 480
plt.show()

Review Questions
- For a matrix A with SVD decomposition A = U \Sigma V^\top, what are the columns of U and how can we find them? What are the columns of V and how can we find them? What are the entries of \Sigma and how can we find them?
- What special properties are true of U? V? \Sigma?
- What are the shapes of U, \Sigma, and V in the full SVD of an m \times n matrix?
- What are the shapes of U, \Sigma, and V in the reduced SVD of an m \times n matrix?
- What is the cost of computing the SVD?
- Given an already computed SVD of a matrix A, what is the cost of using the SVD to solve a linear system A\boldsymbol{x} = \boldsymbol{b}? How would you use the SVD to solve this system?
- Given an already computed SVD of a matrix A, what is the cost of using the SVD to solve a least squares problem A \boldsymbol{x} \cong \boldsymbol{b}? How do you solve a least squares problem using the SVD?
- How do you use the SVD to compute a low-rank approximation of a matrix? For a small matrix, you should be able to compute a given low rank approximation (i.e. rank-one, rank-two).
- Given the SVD of a matrix A, what is the SVD of A^{+} (the pseudoinverse of A)?
- Given the SVD of a matrix A, what is the 2-norm of the matrix? What is the 2-norm condition number of the matrix?
ChangeLog
- 2017-12-04 Arun Lakshmanan lakshma2@illinois.edu: fix best rank approx, svd image
- 2017-11-15 Erin Carrier ecarrie2@illinois.edu: adds review questions, adds cond num sec, removes normal equations, minor corrections and clarifications
- 2017-11-13 Arun Lakshmanan lakshma2@illinois.edu: first complete draft
- 2017-10-17 Luke Olson lukeo@illinois.edu: outline