- Compute eigenvalue/eigenvector for various applications.
- Use the Power Method to find an eigenvector.

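An **eigenvalue** of an $n \times n$ matrix $\mathbf{A}$ is a scalar $\lambda$ such that

$$\mathbf{A}\mathbf{x} = \lambda\mathbf{x}$$

for some non-zero vector $\mathbf{x}$. The eigenvalue can be any real or complex scalar (which we write $\lambda \in \mathbb{R}$ or $\lambda \in \mathbb{C}$). Eigenvalues can be complex even if all the entries of the matrix $\mathbf{A}$ are real. In this case, the corresponding vector $\mathbf{x}$ must have complex-valued components (which we write $\mathbf{x} \in \mathbb{C}^n$). The equation $\mathbf{A}\mathbf{x} = \lambda\mathbf{x}$ is called the **eigenvalue equation**, and any such non-zero vector $\mathbf{x}$ is called an **eigenvector** of $\mathbf{A}$ corresponding to $\lambda$.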

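The eigenvalue equation can be rearranged to $(\mathbf{A} - \lambda\mathbf{I})\mathbf{x} = \mathbf{0}$, and because $\mathbf{x}$ is not zero this has solutions if and only if $\lambda$ is a solution of the **characteristic equation**:

$$\det(\mathbf{A} - \lambda\mathbf{I}) = 0.$$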

The expression $\det(\mathbf{A} - \lambda\mathbf{I})$ is called the **characteristic polynomial** and is a polynomial of degree $n$.

Although all eigenvalues can be found by solving the characteristic equation, there is no general, closed-form analytical solution for the roots of polynomials of degree $5$ or higher, so this is not a good numerical approach for finding eigenvalues.
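As a quick numerical check of the definitions above, the following sketch verifies the eigenvalue equation for a small matrix and compares the library eigenvalues with the roots of the characteristic polynomial. The matrix is an assumption chosen for illustration; `numpy.poly` and `numpy.roots` are used only to show the connection, not as a recommended way to compute eigenvalues.

```python
import numpy as np

# Illustrative 2x2 matrix (an assumption, not taken from the notes)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Library eigen-solver: column j of X is an eigenvector for lam[j]
lam, X = np.linalg.eig(A)

# Check the eigenvalue equation A x = lambda x for each pair
for j in range(len(lam)):
    assert np.allclose(A @ X[:, j], lam[j] * X[:, j])

# Coefficients of the characteristic polynomial, highest degree first
char_poly = np.poly(A)
# Its roots are the eigenvalues (fine for tiny examples only)
roots = np.roots(char_poly)

print(np.sort(lam), np.sort(roots))
```

For larger matrices the polynomial route tends to be badly conditioned, which is one practical reason iterative methods are preferred.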

Unless otherwise specified, we write eigenvalues ordered by magnitude, so that

$$|\lambda_1| \geq |\lambda_2| \geq \dots \geq |\lambda_n|,$$

and we normalize eigenvectors, so that $\|\mathbf{x}\| = 1$.

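Given a matrix $\mathbf{A}$, for any constant scalar $\sigma$, we define the **shifted matrix** as $\mathbf{A} - \sigma\mathbf{I}$. If $\lambda$ is an eigenvalue of $\mathbf{A}$ with eigenvector $\mathbf{x}$, then $\lambda - \sigma$ is an eigenvalue of the shifted matrix with the same eigenvector. This can be derived by

$$(\mathbf{A} - \sigma\mathbf{I})\mathbf{x} = \mathbf{A}\mathbf{x} - \sigma\mathbf{I}\mathbf{x} = \lambda\mathbf{x} - \sigma\mathbf{x} = (\lambda - \sigma)\mathbf{x}.$$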

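An invertible matrix cannot have an eigenvalue equal to zero. Furthermore, the eigenvalues of the inverse matrix are the reciprocals of the eigenvalues of the original matrix:

$$\mathbf{A}\mathbf{x} = \lambda\mathbf{x} \implies \mathbf{A}^{-1}\mathbf{x} = \frac{1}{\lambda}\mathbf{x}.$$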

Similarly, we can describe the eigenvalues of shifted inverse matrices as:

$$(\mathbf{A} - \sigma\mathbf{I})^{-1}\mathbf{x} = \frac{1}{\lambda - \sigma}\mathbf{x}.$$

It is important to note here that the eigenvectors remain unchanged for shifted and/or inverted matrices.
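The shift and inverse relations can be checked numerically. The sketch below uses an illustrative matrix and shift (both assumptions) and compares the eigenvalues returned by `numpy.linalg.eigvals` with the values predicted by the formulas.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # illustrative matrix (assumed)
sigma = 0.5                  # illustrative shift (assumed)
I = np.eye(2)

lam = np.sort(np.linalg.eigvals(A))

# Shifted matrix: eigenvalues become lambda - sigma
lam_shift = np.sort(np.linalg.eigvals(A - sigma * I))

# Inverse matrix: eigenvalues become 1 / lambda
lam_inv = np.sort(np.linalg.eigvals(np.linalg.inv(A)))

# Shifted inverse: eigenvalues become 1 / (lambda - sigma)
lam_shift_inv = np.sort(np.linalg.eigvals(np.linalg.inv(A - sigma * I)))

print(lam, lam_shift, lam_inv, lam_shift_inv)
```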

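An $n \times n$ matrix $\mathbf{A}$ with $n$ linearly independent eigenvectors can be expressed in terms of its eigenvalues and eigenvectors as:

$$\mathbf{A}\mathbf{X} = \mathbf{X}\mathbf{D}, \qquad \mathbf{X} = \begin{bmatrix} \mathbf{x}_1 & \cdots & \mathbf{x}_n \end{bmatrix}, \qquad \mathbf{D} = \mathrm{diag}(\lambda_1, \dots, \lambda_n),$$

where column $j$ of $\mathbf{X}$ is an eigenvector corresponding to $\lambda_j$.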

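The eigenvector matrix $\mathbf{X}$ can be inverted to obtain the following **similarity transformation** of $\mathbf{A}$:

$$\mathbf{A}\mathbf{X} = \mathbf{X}\mathbf{D} \iff \mathbf{A} = \mathbf{X}\mathbf{D}\mathbf{X}^{-1} \iff \mathbf{X}^{-1}\mathbf{A}\mathbf{X} = \mathbf{D}.$$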

Multiplying the matrix $\mathbf{A}$ by $\mathbf{X}^{-1}$ on the left and $\mathbf{X}$ on the right transforms it into a diagonal matrix; it has been ‘‘diagonalized’’.

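An $n \times n$ matrix is diagonalizable if and only if it has $n$ linearly independent eigenvectors. For example:

$$\begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}.$$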

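A matrix with linearly dependent eigenvectors is not diagonalizable. For example, while it is true that

$$\underbrace{\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}}_{\mathbf{A}} \underbrace{\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}}_{\mathbf{X}} = \underbrace{\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}}_{\mathbf{X}} \underbrace{\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}}_{\mathbf{D}},$$

the matrix $\mathbf{X}$ does not have an inverse, so we cannot diagonalize $\mathbf{A}$ by applying an inverse. In fact, for any non-singular matrix $\mathbf{P}$, the product $\mathbf{P}^{-1}\mathbf{A}\mathbf{P}$ is not diagonal.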
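Both situations can be observed numerically. In the sketch below, the first matrix is an illustrative diagonalizable example (an assumption), and the second is a standard $2 \times 2$ matrix with a repeated eigenvalue and only one independent eigenvector; for the latter, the eigenvector matrix returned by `numpy.linalg.eig` is singular to working precision.

```python
import numpy as np

# Diagonalizable example (assumed for illustration): eigenvalues 5 and 2
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
lam, X = np.linalg.eig(A)
D = np.linalg.inv(X) @ A @ X          # similarity transformation
print(np.round(D, 12))                # diagonal, with lam on the diagonal

# Non-diagonalizable example: repeated eigenvalue 1, one eigenvector
J = np.array([[1.0, 1.0],
              [0.0, 1.0]])
lam_J, X_J = np.linalg.eig(J)
cond_X_J = np.linalg.cond(X_J)        # huge: columns are (nearly) parallel
print(lam_J, cond_X_J)
```

A very large condition number of the computed eigenvector matrix is a practical warning sign that the matrix is not diagonalizable, or is close to one that is not.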

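If an $n \times n$ matrix $\mathbf{A}$ is diagonalizable, then we can write an arbitrary vector as a linear combination of the eigenvectors of $\mathbf{A}$. Let $\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_n$ be $n$ linearly independent eigenvectors of $\mathbf{A}$, with corresponding eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$; then an arbitrary vector $\mathbf{x}_0$ can be written:

$$\mathbf{x}_0 = \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \dots + \alpha_n\mathbf{u}_n.$$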

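If we apply the matrix $\mathbf{A}$ to $\mathbf{x}_0$:

$$\mathbf{A}\mathbf{x}_0 = \alpha_1\mathbf{A}\mathbf{u}_1 + \alpha_2\mathbf{A}\mathbf{u}_2 + \dots + \alpha_n\mathbf{A}\mathbf{u}_n = \alpha_1\lambda_1\mathbf{u}_1 + \alpha_2\lambda_2\mathbf{u}_2 + \dots + \alpha_n\lambda_n\mathbf{u}_n.$$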

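If we repeatedly apply $\mathbf{A}$ we have

$$\mathbf{A}^k\mathbf{x}_0 = \alpha_1\lambda_1^k\mathbf{u}_1 + \alpha_2\lambda_2^k\mathbf{u}_2 + \dots + \alpha_n\lambda_n^k\mathbf{u}_n = \lambda_1^k\left[\alpha_1\mathbf{u}_1 + \alpha_2\left(\frac{\lambda_2}{\lambda_1}\right)^k\mathbf{u}_2 + \dots + \alpha_n\left(\frac{\lambda_n}{\lambda_1}\right)^k\mathbf{u}_n\right].$$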

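In the case where one eigenvalue has magnitude that is strictly greater than all the others, i.e.

$$|\lambda_1| > |\lambda_2| \geq |\lambda_3| \geq \dots \geq |\lambda_n|,$$

this implies

$$\lim_{k \to \infty} \frac{\mathbf{A}^k\mathbf{x}_0}{\lambda_1^k} = \alpha_1\mathbf{u}_1.$$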

This observation motivates the algorithm known as **power iteration**, which is the topic of the next section.
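The limit above can be observed directly. The following sketch assumes an illustrative symmetric matrix whose eigenvalues ($3$ and $1$) and dominant eigenvector are known in closed form, and repeatedly applies $\mathbf{A}$ while dividing by $\lambda_1$ at each step.

```python
import numpy as np

# Illustrative symmetric matrix (assumed): eigenvalues 3 and 1
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam1 = 3.0
u1 = np.array([1.0, 1.0]) / np.sqrt(2.0)   # unit eigenvector for lam1

x0 = np.array([1.0, 0.0])
# For a symmetric matrix the eigenvectors are orthonormal,
# so the expansion coefficient is just an inner product.
alpha1 = u1 @ x0

x = x0.copy()
for k in range(30):
    x = (A @ x) / lam1        # after the loop, x = A^30 x0 / lam1^30

limit = alpha1 * u1
print(x, limit)
```

Dividing by $\lambda_1$ at every step keeps the entries bounded here, but it requires knowing $\lambda_1$ in advance, which is exactly what is not available in practice.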

For a matrix $\mathbf{A}$, power iteration will find a scalar multiple of an eigenvector $\mathbf{u}_1$, corresponding to the dominant eigenvalue (largest in magnitude) $\lambda_1$, provided that $|\lambda_1|$ is strictly greater than the magnitude of the other eigenvalues, i.e., $|\lambda_1| > |\lambda_2| \geq \dots \geq |\lambda_n|$.

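Suppose

$$\mathbf{x}_0 = \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \dots + \alpha_n\mathbf{u}_n, \qquad \alpha_1 \neq 0.$$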

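From the previous section, the iterative sequence

$$\mathbf{x}_k = \mathbf{A}\mathbf{x}_{k-1}, \qquad k = 1, 2, \dots$$

satisfies

$$\mathbf{x}_k = \mathbf{A}^k\mathbf{x}_0 = \lambda_1^k\left[\alpha_1\mathbf{u}_1 + \alpha_2\left(\frac{\lambda_2}{\lambda_1}\right)^k\mathbf{u}_2 + \dots + \alpha_n\left(\frac{\lambda_n}{\lambda_1}\right)^k\mathbf{u}_n\right].$$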

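Thus, for large $k$, $\mathbf{x}_k \approx \lambda_1^k\alpha_1\mathbf{u}_1$. Unfortunately, this means that $\|\mathbf{x}_k\| \approx |\lambda_1|^k\,\|\alpha_1\mathbf{u}_1\|$, which will be very large if $|\lambda_1| > 1$, or very small if $|\lambda_1| < 1$. For this reason, we use **normalized** power iteration.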

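Normalized power iteration is given by the following. Let $\mathbf{x}_0$ be a vector with unit norm, $\|\mathbf{x}_0\| = 1$ (any norm is fine), with $\mathbf{x}_0 = \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \dots + \alpha_n\mathbf{u}_n$ and $\alpha_1 \neq 0$.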

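**Normalized power iteration** is defined by the following iterative sequence for $k = 1, 2, \dots$:

$$\mathbf{y}_k = \mathbf{A}\mathbf{x}_{k-1}, \qquad \mathbf{x}_k = \frac{\mathbf{y}_k}{\|\mathbf{y}_k\|},$$

where the norm is identical to the norm used when we assumed $\|\mathbf{x}_0\| = 1$.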

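It can be shown that this sequence satisfies

$$\mathbf{x}_k = \frac{\mathbf{A}^k\mathbf{x}_0}{\|\mathbf{A}^k\mathbf{x}_0\|} = \left(\frac{\lambda_1}{|\lambda_1|}\right)^k \frac{\alpha_1\mathbf{u}_1 + \sum_{j=2}^n \alpha_j\left(\frac{\lambda_j}{\lambda_1}\right)^k\mathbf{u}_j}{\left\|\alpha_1\mathbf{u}_1 + \sum_{j=2}^n \alpha_j\left(\frac{\lambda_j}{\lambda_1}\right)^k\mathbf{u}_j\right\|}.$$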

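This means that for large values of $k$, we have

$$\mathbf{x}_k \approx \left(\frac{\lambda_1}{|\lambda_1|}\right)^k \frac{\alpha_1\mathbf{u}_1}{\|\alpha_1\mathbf{u}_1\|}.$$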

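The largest eigenvalue could be positive, negative, or a complex number. In each case we will have:

$$\left(\frac{\lambda_1}{|\lambda_1|}\right)^k = \begin{cases} 1 & \text{if } \lambda_1 > 0, \\ (-1)^k & \text{if } \lambda_1 < 0, \\ e^{ik\theta} & \text{if } \lambda_1 = re^{i\theta}. \end{cases}$$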

Strictly speaking, normalized power iteration only converges to a single vector if $\lambda_1 > 0$, but $\mathbf{x}_k$ will be close to a scalar multiple of the eigenvector $\mathbf{u}_1$ for large values of $k$, regardless of whether the dominant eigenvalue is positive, negative, or complex. So normalized power iteration will work for any value of $\lambda_1$, as long as it is strictly bigger in magnitude than the other eigenvalues.
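The sign-flipping behaviour for a negative dominant eigenvalue can be seen in a small experiment. The sketch below assumes an illustrative matrix with eigenvalues $-3$ and $-1$ and uses the 2-norm; consecutive iterates alternate in sign, yet each one is (to working precision) a unit multiple of the dominant eigenvector.

```python
import numpy as np

# Illustrative matrix (assumed): eigenvalues -3 and -1
A = -np.array([[2.0, 1.0],
               [1.0, 2.0]])
u1 = np.array([1.0, 1.0]) / np.sqrt(2.0)   # unit eigenvector for -3

x = np.array([1.0, 0.0])
for k in range(40):                  # normalized power iteration
    y = A @ x
    x = y / np.linalg.norm(y)
x_even = x                           # iterate after 40 steps

y = A @ x_even
x_odd = y / np.linalg.norm(y)        # iterate after 41 steps

print(x_even, x_odd)                 # same direction, opposite sign
```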

The following code snippet performs power iteration:

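```
import numpy as np

def power_iter(A, x_0, p, max_iterations=100):
    # A: n x n matrix, x_0: initial guess, p: type of norm (e.g. 2 or np.inf)
    # max_iterations: number of iterations to perform
    x_k = x_0 / np.linalg.norm(x_0, p)        # normalize the initial guess
    for i in range(max_iterations):
        y_k = A @ x_k                         # apply the matrix
        x_k = y_k / np.linalg.norm(y_k, p)    # renormalize in the same norm
    return x_k
```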

We’ll use normalized power iteration (with the infinity norm) to approximate an eigenvector of the following matrix: and the following initial guess:

**First Iteration**:

**Second Iteration**:

Even after only two iterations, we are getting close to a corresponding eigenvector:

with relative error about 4 percent when measured in the infinity norm.

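Power iteration allows us to find an approximate eigenvector corresponding to the largest eigenvalue in magnitude. How can we compute the actual eigenvalue from this? If $\mathbf{A}\mathbf{x} = \lambda\mathbf{x}$, then we can compute the value of $\lambda$ using the **Rayleigh Quotient**:

$$\lambda = \frac{\mathbf{x}^T\mathbf{A}\mathbf{x}}{\mathbf{x}^T\mathbf{x}}.$$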

Thus, one can compute an approximate eigenvalue using the approximate eigenvector found during power iteration.
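A minimal sketch of this two-step process is shown below: run normalized power iteration, then evaluate the Rayleigh quotient at the result. The function names, the test matrix, and the iteration count are assumptions made for illustration, and the quotient is written for real vectors only.

```python
import numpy as np

def rayleigh_quotient(A, x):
    # Real-valued case: x^T A x / x^T x
    return (x @ A @ x) / (x @ x)

def normalized_power_iteration(A, x0, num_iters=50, p=np.inf):
    # Hypothetical helper: fixed number of iterations, norm chosen by p
    x = x0 / np.linalg.norm(x0, p)
    for _ in range(num_iters):
        y = A @ x
        x = y / np.linalg.norm(y, p)
    return x

# Illustrative matrix (assumed): dominant eigenvalue 3
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
x = normalized_power_iteration(A, np.array([1.0, 0.0]))
lam_est = rayleigh_quotient(A, x)
print(x, lam_est)
```

Because the quotient divides by $\mathbf{x}^T\mathbf{x}$, it does not matter which norm was used to normalize the iterate.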

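Recall that we made the assumption that the initial guess satisfies

$$\mathbf{x}_0 = \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \dots + \alpha_n\mathbf{u}_n, \qquad \alpha_1 \neq 0.$$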

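What happens if we choose an initial guess where $\alpha_1 = 0$? If we further assume that $|\lambda_2| > |\lambda_3| \geq \dots \geq |\lambda_n|$, then in theory

$$\mathbf{A}^k\mathbf{x}_0 = \lambda_2^k\left[\alpha_2\mathbf{u}_2 + \alpha_3\left(\frac{\lambda_3}{\lambda_2}\right)^k\mathbf{u}_3 + \dots + \alpha_n\left(\frac{\lambda_n}{\lambda_2}\right)^k\mathbf{u}_n\right]$$

and we would expect that

$$\lim_{k \to \infty} \frac{\mathbf{A}^k\mathbf{x}_0}{\lambda_2^k} = \alpha_2\mathbf{u}_2.$$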

In practice, this does not happen. For one thing, choosing an initial guess such that $\alpha_1 = 0$ is extremely unlikely if we have no prior knowledge about the eigenvector $\mathbf{u}_1$. Moreover, since power iteration is performed numerically, using finite precision arithmetic, we encounter rounding error in every iteration. This means that at every iteration $k$, we will instead have

$$\hat{\mathbf{x}}_k = \hat{\alpha}_1\mathbf{u}_1 + \hat{\alpha}_2\mathbf{u}_2 + \dots + \hat{\alpha}_n\mathbf{u}_n,$$

where the $\hat{\alpha}_j$ are the approximate expansion coefficients of the rounded result. Even if $\alpha_1 = 0$, the finite precision representation $\hat{\mathbf{x}}_0$ will very likely have expansion coefficient $\hat{\alpha}_1 \neq 0$. Even in the case where rounding the initial guess does not introduce a non-zero $\hat{\alpha}_1$, rounding after applying the matrix will almost certainly introduce a non-zero component in the dominant eigenvector after enough iterations. The probability of coming up with a starting guess $\mathbf{x}_0$ such that $\hat{\alpha}_1 = 0$ for all iterations is very, very low, if not zero.

Above, we assumed that one eigenvalue had magnitude strictly larger than all the others: $|\lambda_1| > |\lambda_2| \geq \dots \geq |\lambda_n|$. What happens if $|\lambda_1| = |\lambda_2|$?

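If $\lambda_1 = \lambda_2$ (with all remaining eigenvalues strictly smaller in magnitude), then:

$$\mathbf{A}^k\mathbf{x}_0 = \lambda_1^k\left[\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \sum_{j=3}^n \alpha_j\left(\frac{\lambda_j}{\lambda_1}\right)^k\mathbf{u}_j\right],$$

hence

$$\lim_{k \to \infty} \frac{\mathbf{A}^k\mathbf{x}_0}{\lambda_1^k} = \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2.$$

The quantity $\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2$ is still an eigenvector corresponding to $\lambda_1$, so power iteration will still approach a dominant eigenvector.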

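If the dominant eigenvalues have opposite sign, i.e., $\lambda_1 = -\lambda_2$, then

$$\mathbf{A}^k\mathbf{x}_0 = \lambda_1^k\left[\alpha_1\mathbf{u}_1 + (-1)^k\alpha_2\mathbf{u}_2 + \sum_{j=3}^n \alpha_j\left(\frac{\lambda_j}{\lambda_1}\right)^k\mathbf{u}_j\right].$$

For large $k$, we will have $\lambda_1^{-k}\mathbf{A}^k\mathbf{x}_0 \approx \alpha_1\mathbf{u}_1 + (-1)^k\alpha_2\mathbf{u}_2$, which, although it is a linear combination of two eigenvectors, is **not** itself an eigenvector of $\mathbf{A}$.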
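The failure for eigenvalues of opposite sign is easy to reproduce. The sketch below assumes a diagonal matrix with eigenvalues $2$, $-2$, and $1$, so that the eigenvectors are the coordinate axes; the iterates keep switching between two directions, and neither one satisfies the eigenvalue equation.

```python
import numpy as np

# Illustrative matrix (assumed): lambda_1 = 2, lambda_2 = -2, lambda_3 = 1
A = np.diag([2.0, -2.0, 1.0])

x = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
iterates = []
for k in range(41):                       # normalized power iteration
    y = A @ x
    x = y / np.linalg.norm(y)
    iterates.append(x)

x_a, x_b = iterates[-2], iterates[-1]     # iterates 40 and 41

def residual(A, x):
    # Norm of A x - rho x, with rho the Rayleigh quotient of x
    rho = (x @ A @ x) / (x @ x)
    return np.linalg.norm(A @ x - rho * x)

print(x_a, x_b, residual(A, x_b))
```

The second component changes sign from one iterate to the next, and the residual stays far from zero, so no eigenvector is being approached.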

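Finally, if the two dominant eigenvalues are a complex-conjugate pair $\lambda_1 = re^{i\theta}$, $\lambda_2 = re^{-i\theta}$, then

$$\mathbf{A}^k\mathbf{x}_0 = r^k\left[e^{ik\theta}\alpha_1\mathbf{u}_1 + e^{-ik\theta}\alpha_2\mathbf{u}_2 + \sum_{j=3}^n \alpha_j\left(\frac{\lambda_j}{r}\right)^k\mathbf{u}_j\right].$$

For large $k$, $r^{-k}\mathbf{A}^k\mathbf{x}_0$ approximates a linear combination of the two eigenvectors $\mathbf{u}_1$ and $\mathbf{u}_2$, but this linear combination will not itself be an eigenvector.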

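To obtain an eigenvector corresponding to the **smallest** eigenvalue (in magnitude) $\lambda_n$ of a non-singular matrix, we can apply power iteration to $\mathbf{A}^{-1}$. The following recurrence relationship describes the inverse iteration algorithm:

$$\mathbf{x}_{k+1} = \frac{\mathbf{A}^{-1}\mathbf{x}_k}{\|\mathbf{A}^{-1}\mathbf{x}_k\|}.$$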

To obtain an eigenvector corresponding to the eigenvalue closest to some value $\sigma$, $\mathbf{A}$ can be shifted by $\sigma$ and inverted, and power iteration applied to the result. The following recurrence relationship describes the shifted inverse iteration algorithm:

$$\mathbf{x}_{k+1} = \frac{(\mathbf{A} - \sigma\mathbf{I})^{-1}\mathbf{x}_k}{\|(\mathbf{A} - \sigma\mathbf{I})^{-1}\mathbf{x}_k\|}.$$

Note that this is identical to inverse iteration if the shift is zero.
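A minimal sketch of (shifted) inverse iteration follows. Rather than forming the inverse, each step solves a linear system with `numpy.linalg.solve`; the function name, test matrix, shift, and iteration count are assumptions for illustration. In a real implementation one would typically factor the shifted matrix once and reuse the factorization in every iteration.

```python
import numpy as np

def shifted_inverse_iteration(A, x0, sigma=0.0, num_iters=30):
    # Hypothetical helper; sigma = 0 gives plain inverse iteration
    n = A.shape[0]
    M = A - sigma * np.eye(n)
    x = x0 / np.linalg.norm(x0)
    for _ in range(num_iters):
        y = np.linalg.solve(M, x)     # y = (A - sigma I)^{-1} x
        x = y / np.linalg.norm(y)
    return x

# Illustrative matrix (assumed): eigenvalues 3 and 1
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
x0 = np.array([1.0, 0.0])

x_small = shifted_inverse_iteration(A, x0)             # targets eigenvalue 1
lam_small = x_small @ A @ x_small                      # Rayleigh quotient

x_near = shifted_inverse_iteration(A, x0, sigma=2.7)   # targets eigenvalue 3
lam_near = x_near @ A @ x_near

print(lam_small, lam_near)
```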

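The shift can be updated based on a current estimate of the eigenvalue in order to improve the convergence rate. Such an estimate can be found using the Rayleigh Quotient. Rayleigh Quotient Iteration is given by the following recurrence relation:

$$\sigma_k = \frac{\mathbf{x}_k^T\mathbf{A}\mathbf{x}_k}{\mathbf{x}_k^T\mathbf{x}_k}, \qquad \mathbf{x}_{k+1} = \frac{(\mathbf{A} - \sigma_k\mathbf{I})^{-1}\mathbf{x}_k}{\|(\mathbf{A} - \sigma_k\mathbf{I})^{-1}\mathbf{x}_k\|}.$$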
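The recurrence can be sketched as follows for a real symmetric matrix. The function name, tolerance, and starting vector are assumptions; the residual check and the `LinAlgError` guard are additions to the recurrence, included because the shifted matrix becomes singular as the shift converges to an eigenvalue.

```python
import numpy as np

def rayleigh_quotient_iteration(A, x0, num_iters=10, tol=1e-12):
    # Hypothetical helper for real symmetric A
    n = A.shape[0]
    x = x0 / np.linalg.norm(x0)
    sigma = x @ A @ x                         # x has unit 2-norm
    for _ in range(num_iters):
        if np.linalg.norm(A @ x - sigma * x) < tol:
            break                             # converged
        try:
            y = np.linalg.solve(A - sigma * np.eye(n), x)
        except np.linalg.LinAlgError:
            break                             # shift is numerically an eigenvalue
        x = y / np.linalg.norm(y)
        sigma = x @ A @ x                     # update the shift
    return sigma, x

# Illustrative matrix and starting vector (assumed)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
sigma, x = rayleigh_quotient_iteration(A, np.array([1.0, 0.5]))
print(sigma, x)
```

The starting vector determines which eigenpair is found; here the initial Rayleigh quotient is $2.8$, so the iteration is drawn to the eigenvalue $3$.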

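The convergence rate for power iteration is **linear**, and the recurrence relationship for the error between the current iterate and a dominant eigenvector is given by:

$$\|\mathbf{e}_{k+1}\| \approx \frac{|\lambda_2|}{|\lambda_1|}\|\mathbf{e}_k\|.$$

The convergence rate for (shifted) inverse iteration is also linear, but now depends on the two eigenvalues closest to the shift $\sigma$. (Standard inverse iteration corresponds to a shift $\sigma = 0$.) The recurrence relationship for the errors is given by:

$$\|\mathbf{e}_{k+1}\| \approx \frac{|\lambda_{\text{closest}} - \sigma|}{|\lambda_{\text{second-closest}} - \sigma|}\|\mathbf{e}_k\|.$$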
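The linear rate for power iteration can be measured empirically. The sketch below assumes an illustrative matrix with $\lambda_1 = 3$ and $\lambda_2 = 1$ and a known dominant eigenvector, and records the ratio of successive errors, which should approach $|\lambda_2| / |\lambda_1| = 1/3$.

```python
import numpy as np

# Illustrative matrix (assumed): eigenvalues 3 and 1
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
u1 = np.array([1.0, 1.0]) / np.sqrt(2.0)   # unit dominant eigenvector

x = np.array([1.0, 0.0])
errors = []
for k in range(10):                        # normalized power iteration
    y = A @ x
    x = y / np.linalg.norm(y)
    errors.append(np.linalg.norm(x - u1))  # error in the 2-norm

# Ratio of successive errors
ratios = [errors[k + 1] / errors[k] for k in range(len(errors) - 1)]
print(ratios)
```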

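Square matrices are called *orthogonal* if and only if the columns are mutually orthogonal to one another and have a norm of $1$ (such a set of vectors is formally known as an *orthonormal* set), i.e.:

$$\mathbf{Q}^T\mathbf{Q} = \mathbf{I}$$

or

$$\langle \mathbf{q}_i, \mathbf{q}_j \rangle = \begin{cases} 0 & \text{if } i \neq j, \\ 1 & \text{if } i = j, \end{cases} \qquad \forall\, \mathbf{Q} \in O(n),$$

where $O(n)$ is the set of all $n \times n$ orthogonal matrices called the orthogonal group, $\mathbf{q}_k$, $k = 1, \dots, n$, are the columns of $\mathbf{Q}$, and $\langle \cdot, \cdot \rangle$ is the inner product operator. Orthogonal matrices have many desirable properties:

$$\mathbf{Q}^T \in O(n), \qquad \mathbf{Q}^T\mathbf{Q} = \mathbf{Q}\mathbf{Q}^T = \mathbf{I} \implies \mathbf{Q}^{-1} = \mathbf{Q}^T, \qquad \det\mathbf{Q} = \pm 1, \qquad \kappa_2(\mathbf{Q}) = 1.$$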

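The algorithm to construct an orthogonal basis from a set of linearly independent vectors is called the Gram-Schmidt process. For a basis set $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n\}$, we can form an orthogonal set $\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\}$ given by the following transformation:

$$\mathbf{v}_1 = \mathbf{x}_1, \qquad \mathbf{v}_k = \mathbf{x}_k - \sum_{j=1}^{k-1} \frac{\langle \mathbf{v}_j, \mathbf{x}_k \rangle}{\langle \mathbf{v}_j, \mathbf{v}_j \rangle}\mathbf{v}_j, \quad k = 2, \dots, n,$$

where $\langle \cdot, \cdot \rangle$ is the inner product operator. Each of the vectors in the orthogonal set can be normalized independently to obtain an orthonormal basis.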
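The process can be sketched in a few lines for real vectors stored as the columns of an array. This is the classical variant, written for clarity rather than numerical robustness; the function name and the example vectors are assumptions.

```python
import numpy as np

def gram_schmidt(X):
    # Hypothetical helper: columns of X are linearly independent vectors.
    # Returns V (orthogonal columns) and Q (orthonormal columns).
    n_vectors = X.shape[1]
    V = np.zeros_like(X, dtype=float)
    for k in range(n_vectors):
        v = X[:, k].astype(float)
        for j in range(k):
            # Subtract the projection of x_k onto v_j
            coeff = (V[:, j] @ X[:, k]) / (V[:, j] @ V[:, j])
            v = v - coeff * V[:, j]
        V[:, k] = v
    Q = V / np.linalg.norm(V, axis=0)     # normalize each column
    return V, Q

# Illustrative basis (assumed)
X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
V, Q = gram_schmidt(X)
print(np.round(Q.T @ Q, 12))              # identity matrix
```

Since the resulting `Q` is square with orthonormal columns, it is an orthogonal matrix, so its transpose is also its inverse.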


- 2020-03-01 Peter Sentz: added text to include content from slides
- 2018-10-14 Erin Carrier ecarrie2@illinois.edu: removes orthogonal/GS sections
- 2018-01-14 Erin Carrier ecarrie2@illinois.edu: removes demo links
- 2017-11-10 Erin Carrier ecarrie2@illinois.edu: adds costs of methods
- 2017-10-26 Matthew West mwest@illinois.edu: rewrote eval/evec definitions
- 2017-10-25 Erin Carrier ecarrie2@illinois.edu: minor fixes, added review questions
- 2017-10-14 Arun Lakshmanan lakshma2@illinois.edu: first complete draft
- 2017-10-16 Luke Olson lukeo@illinois.edu: outline