# Condition Numbers

## Learning Objectives

• Compute the condition number
• Quantify the impact of a high condition number

## Condition Number Definition

The condition number of a square nonsingular matrix $${\bf A}$$ is defined by $$\text{cond}({\bf A}) = \kappa({\bf A}) = \|{\bf A}\| \|{\bf A}^{-1}\|$$ which is also the condition number associated with solving the linear system $${\bf A} \boldsymbol{x} = \boldsymbol{b}$$. A matrix with a large condition number is said to be ill-conditioned.

The condition number can be measured with any $$p$$-norm, so to be precise we typically specify the norm being used, i.e. $$\text{cond}_2$$, $$\text{cond}_1, \text{cond}_{\infty}$$.

If $${\bf A}$$ is singular, we can define $$\text{cond}({\bf A}) = \infty$$ by convention.

## Perturbed Matrix Problem and Error Bound

Let $$\boldsymbol{x}$$ be the solution of $${\bf A} \boldsymbol{x} = \boldsymbol{b}$$ and $$\hat{\boldsymbol{x}}$$ be the solution of the perturbed problem $${\bf A} \hat{\boldsymbol{x}} = \boldsymbol{b} + \Delta \boldsymbol{b}$$. Let $$\Delta \boldsymbol{x} = \hat{\boldsymbol{x}} - \boldsymbol{x}$$ be the absolute error in output. Then we have $${\bf A} \boldsymbol{x} + {\bf A} \Delta \boldsymbol{x} = \boldsymbol{b} + \Delta \boldsymbol{b},$$ so $${\bf A} \Delta \boldsymbol{x} = \Delta \boldsymbol{b}.$$ Now we want to see how the relative error in output $$\left(\frac{\|\Delta \boldsymbol{x}\|}{\|\boldsymbol{x}\|}\right)$$ is related to the relative error in input $$\left(\frac{\|\Delta \boldsymbol{b}\|}{\|\boldsymbol{b}\|}\right)$$:

\begin{align} \frac{\|\Delta \boldsymbol{x}\| / \|\boldsymbol{x}\|}{\|\Delta \boldsymbol{b}\| / \|\boldsymbol{b}\|} &= \frac{\|\Delta \boldsymbol{x}\| \|\boldsymbol{b}\|}{\|\boldsymbol{x}\| \|\Delta \boldsymbol{b}\|}\\ &= \frac{\|{\bf A}^{-1} \Delta \boldsymbol{b}\| \|{\bf A} \boldsymbol{x}\|}{\|\boldsymbol{x}\| \|\Delta \boldsymbol{b}\|}\\ &\le \frac{\|{\bf A}^{-1}\| \|\Delta \boldsymbol{b}\| \|{\bf A}\| \|\boldsymbol{x}\|}{\|\boldsymbol{x}\| \|\Delta \boldsymbol{b}\|} \\ &= \|{\bf A}^{-1}\| \|{\bf A}\|\\ &= \text{cond}({\bf A}) \end{align}

where we used $$\|{\bf A}\boldsymbol{x}\| \le \|{\bf A}\| \|\boldsymbol{x}\|, \forall \boldsymbol{x}$$

Then

$\frac{\|\Delta \boldsymbol{x}\|}{\|\boldsymbol{x}\|} \le \text{cond}({\bf A})\frac{\|\Delta \boldsymbol{b}\|}{\|\boldsymbol{b}\|} \qquad (1)$

Therefore, if we know the relative error in input, then we can use the condition number of the system to obtain an upper bound for the relative error of our computed solution (output).

## Residual vs Error

The residual vector $$\boldsymbol{r}$$ of approximate solution $$\hat{\boldsymbol{x}}$$ for the linear system $${\bf A} \boldsymbol{x} = \boldsymbol{b}$$ is defined as $$\boldsymbol{r} = \boldsymbol{b} - {\bf A} \hat{\boldsymbol{x}}$$. In the perturbed matrix problem described above, we have

$\boldsymbol{r} = \boldsymbol{b} - (\boldsymbol{b} + \Delta \boldsymbol{b}) = -\Delta \boldsymbol{b}$

Therefore, equation (1) can also be written as

$\frac{\|\Delta \boldsymbol{x}\|}{\|\boldsymbol{x}\|} \le \text{cond}({\bf A})\frac{\|\boldsymbol{r}\|}{\|\boldsymbol{b}\|}$

If we define relative residual as $$\frac{\|\boldsymbol{r}\|}{\|\boldsymbol{b}\|}$$, we can see that small relative residual implies small relative error in approximate solution only if $${\bf A}$$ is well-conditioned ($$\text{cond}({\bf A})$$ is small).

## Gaussian Elimination (with Partial Pivoting) is Guaranteed to Produce a Small Residual

When we use Gaussian elimination with partial pivoting to compute the solution for the linear system $${\bf A} \boldsymbol{x} = \boldsymbol{b}$$ and obtain an approximate solution $$\hat{\boldsymbol{x}}$$, the residual vector $$\boldsymbol{r}$$ satisfies:

$\frac{\|\boldsymbol{r}\|}{\|{\bf A}\| \|\hat{\boldsymbol{x}}\|} \le \frac{\|E\|}{\|{\bf A}\|} \le c \epsilon_{mach}$

where $$E$$ is backward error in $${\bf A}$$ (which is defined by $$({\bf A}+E)\hat{\boldsymbol{x}} = \boldsymbol{b}$$), $$c$$ is a coefficient related to $${\bf A}$$ and $$\epsilon_{mach}$$ is machine epsilon.

Typically $$c$$ is small with partial pivoting, but $$c$$ can be arbitrarily large without pivoting.

Therefore, Gaussian elimination with partial pivoting yields small relative residual regardless of conditioning of the system.

For more details, see Gaussian Elimination & Roundoff Error.

## Accuracy Rule of Thumb and Example

Suppose we apply Gaussian elimination with partial pivoting and back substitution to the linear system $${\bf A} \boldsymbol{x} = \boldsymbol{b}$$ and obtain a computed solution $$\hat{\boldsymbol{x}}$$. If the entries in $${\bf A}$$ and $$\boldsymbol{b}$$ are accurate to $$s$$ decimal digits, and $$\text{cond}({\bf A}) \approx 10^t$$, then the elements of the solution vector $$\hat{\boldsymbol{x}}$$ will be accurate to about $$s-t$$ decimal digits.

For a proof of this rule of thumb, please see Fundamentals of Matrix Computations by David S. Watkins.

Example: How many accurate decimal digits in the solution can we expect to obtain if we solve a linear system $${\bf A} \boldsymbol{x} = \boldsymbol{b}$$ where $$\text{cond}({\bf A}) = 10^{10}$$ using Gaussian elimination with partial pivoting, assuming we are using IEEE double precision and the inputs are accurate to machine precision?

In IEEE double precision, $$\epsilon_{mach} \approx 2.2\times 10^{-16}$$, which means the entries in $${\bf A}$$ and $$\boldsymbol{b}$$ are accurate to $$\vert\log_{10}(2.2\times 10^{-16})\vert \approx 16$$ decimal digits.

Then, using the rule of thumb, we know the entries in $$\hat{\boldsymbol{x}}$$ will be accurate to about $$16-10 = 6$$ decimal digits.

## ChangeLog

• 2017-10-27 Erin Carrier ecarrie2@illinois.edu: adds review questions, minor fixes throughout, revised rule of thumb wording
• 2017-10-27 Yu Meng <yumeng5@illinois.edu: first complete draft
• 2017-10-17 Luke Olson <lukeo.illinois.edu: outline