Errors and Complexity

Learning objectives

Compare and contrast relative and absolute error
Categorize a cost as $O (n^{p})$
Categorize an error as $O (h^{p})$
Identify algebraic vs exponential growth and convergence

Big Picture

Numerical algorithms are distinguished by their cost and error, and the tradeoff between them.
The algorithms or methods introduced in this course indicate their error and cost whenever possible. These might be exact expressions or asymptotic bounds like $O (h^{2})$ as $h \to 0$ or $O (n^{3})$ as $n \to \infty$. For asymptotics we always indicate the limit.

Absolute and Relative Error

Results computed using numerical methods are inaccurate – they are approximations to the true values. We can represent an approximate result as a combination of the true value and some error:

Approximate Result = True Value + Error

\hat{x} = x_{0} + \Delta x

Given this problem setup we can define the absolute error as:

Absolute Error = | x_{0} - \hat{x} | .

This tells us how close our approximate result is to the actual answer. However, absolute error can become an unsatisfactory and misleading representation of the error depending on the magnitude of $x_{0}$.

Case 1	Case 2
$x_{0} = 0.1$ , $\hat{x} = 0.2$	$x_{0} = 100.0$ , $\hat{x} = 100.1$
$∣ x_{0} - \hat{x} ∣= 0.1$	$∣ x_{0} - \hat{x} ∣= 0.1$

In both of these cases, the absolute error is the same, 0.1. However, we would intuitively consider case 2 more accurate than case 1, since in case 1 our approximation is double the true value. Because of this, we define the relative error, which measures the error relative to the magnitude of the true value. To obtain it, we divide the absolute error by the absolute value of the true value (which must be nonzero).

Relative Error = \frac{| x_{0} - \hat{x} |}{| x_{0} |}

If we consider the two cases again, we can see that the relative error will be much lower in the second case.

Case 1	Case 2
$x_{0} = 0.1$ , $\hat{x} = 0.2$	$x_{0} = 100.0$ , $\hat{x} = 100.1$
$\frac{∣ x_{0} - \hat{x} ∣}{∣ x_{0} ∣} = 1$	$\frac{∣ x_{0} - \hat{x} ∣}{∣ x_{0} ∣} = 10^{- 3}$
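The two cases can be checked numerically. The following is a minimal sketch; the helper names `absolute_error` and `relative_error` are illustrative and not part of the original notes, and the relative error assumes a nonzero true value.

```python
def absolute_error(x_true, x_approx):
    """Absolute error |x_0 - x_hat|."""
    return abs(x_true - x_approx)


def relative_error(x_true, x_approx):
    """Relative error |x_0 - x_hat| / |x_0| (assumes x_true != 0)."""
    return absolute_error(x_true, x_approx) / abs(x_true)


# (true value, approximation) pairs for Case 1 and Case 2
cases = [(0.1, 0.2), (100.0, 100.1)]
for x_true, x_approx in cases:
    print(f"abs = {absolute_error(x_true, x_approx):.3g}, "
          f"rel = {relative_error(x_true, x_approx):.3g}")
```

Both absolute errors agree (up to floating-point rounding), while the relative errors differ by three orders of magnitude.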

Significant Digits/Figures

Significant figures of a number are digits that carry meaningful information. They are digits beginning from the leftmost nonzero digit and ending with the rightmost “correct” digit, including final zeros that are exact. For example:

The number 3.14159 has six significant digits.
The number 0.00035 has two significant digits.
The number 0.000350 has three significant digits.

An approximate result $\hat{x}$ has $n$ significant figures of a true value $x_{0}$ if the absolute error, $| x_{0} - \hat{x} |$ , has zeros in the first $n$ decimal places counting from the leftmost nonzero (leading) digit of $x_{0}$ , followed by a digit from 0 to 4.

Example: Assume $x_{0} = 3.141592653$ and suppose $\hat{x}$ is the approximate result:

\hat{x} = 3.14159 ⟶ | x_{0} - \hat{x} | = 0.00000 2653 ⟶ \hat{x} has 6 significant figures.

\hat{x} = 3.1415 ⟶ | x_{0} - \hat{x} | = 0.0000 92653 ⟶ \hat{x} has 4 significant figures.

The number of accurate digits can be estimated by the relative error. If

Relative Error = \frac{| x_{0} - \hat{x} |}{| x_{0} |} \leq 10^{- n + 1}

then $\hat{x}$ has at most $n$ correct digits. Conversely, if an approximation has $n$ correct digits, then the relative error is $\leq 10^{- n + 1}$ .
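As a rough illustration of this rule of thumb, the sketch below finds the largest $n$ for which the relative error satisfies the bound. The function name `max_correct_digits` is hypothetical, and the code assumes a nonzero true value and an approximation that is not exact.

```python
import math


def max_correct_digits(x_true, x_approx):
    """Largest n with relative error <= 10**(-n + 1).

    By the rule of thumb above, the approximation then has at most
    n correct digits. Assumes x_true != 0 and x_approx != x_true.
    """
    rel_err = abs(x_true - x_approx) / abs(x_true)
    return math.floor(1 - math.log10(rel_err))


x_true = 3.141592653
for x_approx in (3.14159, 3.1415):
    print(x_approx, max_correct_digits(x_true, x_approx))
```

For the two approximations of the earlier example this gives 7 and 5, which are upper bounds on the 6 and 4 significant figures found by counting digits directly.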

Absolute and Relative Error of Vectors

If our calculated quantities are vectors, then we use a norm in place of the absolute value. Thus, our formulas become

Absolute Error = ‖ x - \hat{x} ‖

Relative Error = \frac{‖ x - \hat{x} ‖}{‖ x ‖}

We take the norm of the difference (and not the difference of the norms) because we are interested in how far apart these two quantities are. That is, we first form the difference vector and then use the vector norm to measure its length.
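A small sketch of why the order matters, assuming the 2-norm (the notes do not fix a particular norm) and Python 3.8 or later for `math.dist` and multi-argument `math.hypot`. The function names are illustrative.

```python
import math


def vector_absolute_error(x_true, x_approx):
    """2-norm of the difference vector x - x_hat."""
    return math.dist(x_true, x_approx)


def vector_relative_error(x_true, x_approx):
    """Norm of the difference divided by the norm of x (assumes x != 0)."""
    return vector_absolute_error(x_true, x_approx) / math.hypot(*x_true)


x_true = [1.0, 0.0]
x_approx = [0.0, 1.0]

# Norm of the difference: the vectors are sqrt(2) apart.
print(vector_absolute_error(x_true, x_approx))
# Difference of the norms: zero, although the vectors differ.
print(abs(math.hypot(*x_true) - math.hypot(*x_approx)))
```

The two vectors have the same length, so the difference of the norms reports no error at all, while the norm of the difference correctly reports that they are far apart.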

Truncation Error vs. Rounding Error

Rounding error is the error that occurs from rounding values in a computation. This occurs constantly since computers use finite precision. Approximating $\frac{1}{3} = 0.33333 \dots$ with a finite decimal expansion is an example of rounding error.

Truncation error is the error from using an approximate algorithm in place of an exact mathematical procedure or function. For example, in the case of evaluating functions, we may represent our function by a finite Taylor series up to degree $n$. The truncation error is the error incurred by omitting the terms of degree $n + 1$ and higher.
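Both kinds of error can be observed in a few lines. This sketch uses the Taylor series of $e^{x}$ about 0 at $x = 1$; the choice of function and evaluation point is an assumption made for illustration only.

```python
import math


def taylor_exp(x, n):
    """Degree-n Taylor polynomial of exp about 0, evaluated at x."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))


x = 1.0
for n in (1, 2, 4, 8):
    truncation_error = abs(math.exp(x) - taylor_exp(x, n))
    print(f"n = {n}: truncation error = {truncation_error:.2e}")

# Rounding error: 0.1, 0.2 and 0.3 are not exactly representable in
# binary floating point, so the computed sum differs slightly from 0.3.
print(abs((0.1 + 0.2) - 0.3))
```

The printed truncation errors shrink as more terms are kept. They also contain a small amount of rounding error, which is negligible here compared to the truncated terms.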

Big-O Notation

Big-O notation is used to understand and describe asymptotic behavior. The definitions for the cases of approaching 0 or $\infty$ are as follows:

Let $f$ and $g$ be two functions. Then $f (x) = O (g (x))$ as $x \to \infty$ if and only if there exist a constant $M > 0$ and some $x_{0}$ such that $| f (x) | \leq M | g (x) |$ for all $x \geq x_{0}$.

Let $f$ and $g$ be two functions. Then $f (h) = O (g (h))$ as $h \to 0$ if and only if there exist a constant $M > 0$ and some $h_{0} > 0$ such that $| f (h) | \leq M | g (h) |$ for all $h$ with $0 < h < h_{0}$.

But what if we want to consider the function approaching an arbitrary value? Then we can redefine the expression as:

Let $f$ and $g$ be two functions. Then $f (x) = O (g (x))$ as $x \to a$ if and only if there exist a constant $M > 0$ and some $δ > 0$ such that $| f (x) | \leq M | g (x) |$ for all $x$ with $0 < | x - a | < δ$.
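The definitions can be spot-checked numerically for a concrete choice of $f$, $g$, and $M$. The sketch below is only a sanity check at finitely many sample points, not a proof; the helper `satisfies_bound` and the two example functions are assumptions introduced for illustration.

```python
import math


def satisfies_bound(f, g, M, points):
    """Check |f(x)| <= M * |g(x)| at each sample point.

    Passing this check is evidence for f = O(g), not a proof:
    only finitely many points are tested.
    """
    return all(abs(f(x)) <= M * abs(g(x)) for x in points)


# x -> infinity: f(x) = 3x^2 + 5x is O(x^2) with M = 4 and x_0 = 5,
# because 5x <= x^2 whenever x >= 5.
large_x = [5 * 2**k for k in range(20)]
print(satisfies_bound(lambda x: 3 * x**2 + 5 * x, lambda x: x**2, 4, large_x))

# h -> 0: f(h) = sin(h) - h is O(h^3) with M = 1/6.
small_h = [2.0**-k for k in range(1, 9)]
print(satisfies_bound(lambda h: math.sin(h) - h, lambda h: h**3, 1 / 6, small_h))
```

Both checks pass. Replacing $f$ by $x^{3}$ in the first check makes it fail for any fixed $M$ once $x$ is large enough, which matches the fact that $x^{3}$ is not $O (x^{2})$ as $x \to \infty$.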

Big-O Examples - Time Complexity

We can use Big-O to describe the time complexity of our algorithms.

Consider the case of matrix-matrix multiplication. If the size of each of our matrices is $n \times n$ , then the time it will take to multiply the matrices is $O (n^{3})$ meaning that $Run time \approx C \cdot n^{3}$ . Suppose we know that for $n_{1} = 1000$ , the matrix-matrix multiplication takes 5 seconds. Estimate how much time it would take if we double the size of our matrices to $2 n \times 2 n$ .

We know that:

\begin{aligned} Time (2 n_{1}) & \approx C \cdot (2 n_{1})^{3} \\ & = C \cdot 2^{3} \cdot n_{1}^{3} \\ & = 8 \cdot (C \cdot n_{1}^{3}) \\ & = 8 \cdot Time (n_{1}) \\ & = 40 seconds \end{aligned}

So, when we double the size of our matrices to $2 n \times 2 n$, the run time becomes proportional to $(2 n)^{3} = 8 n^{3}$. Thus, the run time will be roughly 8 times as long.
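The same extrapolation works for any size ratio. A minimal sketch, assuming the run time follows $C \cdot n^{p}$ closely enough that the constant cancels; the function name `extrapolate_runtime` is illustrative.

```python
def extrapolate_runtime(t_known, n_known, n_new, p=3):
    """Estimate run time at n_new from one measurement, assuming time ~ C * n**p."""
    return t_known * (n_new / n_known) ** p


# Measured: 5 seconds at n = 1000. Estimate for n = 2000.
print(extrapolate_runtime(5.0, 1000, 2000))  # 40.0
```

In practice the estimate is only as good as the model: cache effects and lower-order terms can make measured times deviate from the pure $n^{3}$ scaling.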

Big-O Examples - Truncation Errors

We can also use Big-O notation to describe the truncation error. A numerical method is called $n$-th order accurate if its truncation error $E (h)$ obeys $E (h) = O (h^{n})$ as $h \to 0$.

Consider solving an interpolation problem. We have an interval of length $h$ where our interpolant is valid and we know that the error of our approximation is $O (h^{2})$ as $h \to 0$. This means that as we decrease $h$ (the interval length), our error will decrease at least quadratically. Using the definition of Big-O, we know that $Error \leq C \cdot h^{2}$ for all sufficiently small $h$, where $C$ is some constant. For example, halving $h$ reduces this bound by a factor of 4.

In some cases, we may not know the exponent in $E (h) = O (h^{n})$. We can estimate it by computing the error at two different values of $h$. Suppose we have two quantities, $h_{1} = 0.5$ and $h_{2} = 0.25$. We compute the corresponding errors as $E (h_{1}) = 0.125$ and $E (h_{2}) = 0.015625$. Then, assuming the error behaves like $E (h) \approx C h^{n}$ for small $h$, we have:

\begin{aligned} \frac{0.125}{0.015625} & = \frac{E (h_{1})}{E (h_{2})} \\ & \approx \frac{C h_{1}^{n}}{C h_{2}^{n}} \\ & = {(\frac{h_{1}}{h_{2}})}^{n} \end{aligned}

\begin{array}{r} ⟹ \log (\frac{0.125}{0.015625}) = n \log (\frac{h_{1}}{h_{2}}) = n \log (\frac{0.5}{0.25}) \end{array}

Solving this equation for $n$ , we obtain $n = 3$ .
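The calculation above can be written as a short function. This is a sketch under the same assumption as the derivation, namely that the error behaves like $C h^{n}$ with the same constant at both values of $h$; the name `estimate_order` is illustrative.

```python
import math


def estimate_order(h1, e1, h2, e2):
    """Estimate n in E(h) ~ C * h**n from two (h, error) pairs."""
    return math.log(e1 / e2) / math.log(h1 / h2)


n = estimate_order(0.5, 0.125, 0.25, 0.015625)
print(round(n, 6))
```

With measured errors the result is rarely an exact integer; a value close to 3 would still suggest a third-order method.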

Big-O Example - Role of Constants

It is important that one does not place too much importance on the constant $M$ in the definition of Big-O notation; it is essentially arbitrary.

Suppose $f_{1} (n) = 10^{- 20} n^{2}$ and $f_{2} (n) = 10^{20} n^{2}$. While $f_{2}$ is much larger than $f_{1}$ for all $n > 0$, both are $O (n^{2})$; this follows directly from the definition if we choose any constants $M_{1} \geq 10^{- 20}$ and $M_{2} \geq 10^{20}$.

However, it is also true that $f_{2} (n) = O (10^{- 20} n^{2})$, since for any constant $M \geq 10^{40}$:

\begin{array}{r} f_{2} (n) = 10^{20} n^{2} = 10^{40} \times 10^{- 20} n^{2} \leq M \times 10^{- 20} n^{2} . \end{array}

So including a constant inside the $O$ is basically meaningless.

Question: What is the function $g (n)$ that gives the tightest bound on $f_{2} (n) = O (g (n))$ ?

Solution: the answer is $g (n) = n^{2}$. For any $r < 2$, there is no constant $M$ such that $| f_{2} (n) | \leq M n^{r}$ for all $n$ sufficiently large. So $n^{r}$ for $r < 2$ is not a bound on $f_{2}$. For any $q > 2$, there exists a pair of constants $M_{1}$ and $M_{2}$ such that for all $n$ sufficiently large:

\begin{array}{r} f_{2} (n) \leq M_{1} n^{2} \leq M_{2} n^{q} . \end{array}

However, we cannot find a pair of constants $M_{3}$ and $M_{4}$ such that:

\begin{array}{r} f_{2} (n) \leq M_{3} n^{q} \leq M_{4} n^{2} . \end{array}

Thus, we cannot “fit” another function in between $f_{2} (n)$ and $n^{2}$ , so $n^{2}$ is the tightest bound.

One may be tempted to think the correct answer should actually be $g (n) = 10^{20} n^{2}$ ; however, this does not actually provide any additional information about the growth of $f_{2}$ . Notice that we didn’t specify what $M_{1}$ and $M_{2}$ were in the inequality above. Big-O notation says nothing about the size of the constant. The statements

\begin{aligned} f_{2} (n) & = O (n^{2}), \\ f_{2} (n) & = O (10^{20} n^{2}), \\ f_{2} (n) & = O (10^{- 20} n^{2}), \end{aligned}

are all equivalent, in that they all give the same amount of information on the growth of $f_{2}$, since the constants are not specified. Since $10^{- 20}$ is very small, it may be tempting to conclude that it is “tighter” than the other two, which is not true. Therefore, it is always best practice to avoid placing unnecessary constants inside the $O$, and we expect you to refrain from doing so in this course.
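One way to see which exponents give a valid bound is to look at the ratio $f_{2} (n) / n^{r}$ as $n$ grows. The sketch below is a numerical illustration only; the sample sizes and the exponents 1.5, 2, and 2.5 are arbitrary choices.

```python
def f2(n):
    """The example function f_2(n) = 10^20 * n^2."""
    return 1e20 * n**2


sizes = [10.0**k for k in range(1, 7)]
for r in (1.5, 2.0, 2.5):
    ratios = [f2(n) / n**r for n in sizes]
    print(f"r = {r}: first ratio {ratios[0]:.3g}, last ratio {ratios[-1]:.3g}")
```

For $r = 1.5$ the ratio keeps growing, so no constant $M$ can bound it. For $r = 2$ the ratio stays fixed at $10^{20}$: large, but finite, which is all the definition requires. For $r = 2.5$ the ratio shrinks, so $n^{2.5}$ is a valid but looser bound.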

Convergence Definitions

Algebraic growth/convergence is when the coefficients $a_{n}$ in the sequence we are interested in behave like $O (n^{α})$ for growth and $O (1 / n^{α})$ for convergence as $n \to \infty$, where $α$ is called the algebraic index of convergence. A sequence that grows or converges algebraically appears as a straight line in a log-log plot.

Exponential growth/convergence is when the coefficients $a_{n}$ of the sequence we are interested in behave like $O (e^{q n^{β}})$ for growth and $O (e^{- q n^{β}})$ for convergence, where $q$ is a constant and $β > 0$. Exponential growth is much faster than algebraic growth. Exponential growth/convergence is also sometimes called spectral growth/convergence. A sequence that grows or converges exponentially with $β = 1$ appears as a straight line in a log-linear plot. Exponential convergence is often further classified as supergeometric, geometric, or subgeometric convergence.

Figures from J. P. Boyd, *Chebyshev and Fourier Spectral Methods*, 2nd ed., Dover, New York, 2001.
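The straight-line behavior described above can be checked without plotting by computing slopes between pairs of points. This is a sketch with two made-up sequences, $1 / n^{2}$ (algebraic, $α = 2$) and $e^{- 0.5 n}$ (geometric, $q = 0.5$, $β = 1$); the function names are illustrative.

```python
import math


def loglog_slope(n1, a1, n2, a2):
    """Slope between two points of a sequence on a log-log plot."""
    return (math.log(a2) - math.log(a1)) / (math.log(n2) - math.log(n1))


def loglinear_slope(n1, a1, n2, a2):
    """Slope between two points of a sequence on a log-linear plot."""
    return (math.log(a2) - math.log(a1)) / (n2 - n1)


def algebraic(n):
    return 1.0 / n**2          # converges like O(1 / n^2), alpha = 2


def geometric(n):
    return math.exp(-0.5 * n)  # converges like O(e^{-q n}), q = 0.5, beta = 1


for n1, n2 in [(10, 20), (20, 40)]:
    print(loglog_slope(n1, algebraic(n1), n2, algebraic(n2)),
          loglinear_slope(n1, geometric(n1), n2, geometric(n2)))
```

The log-log slope of the algebraic sequence stays near $- α = - 2$ and the log-linear slope of the geometric sequence stays near $- q = - 0.5$ on both intervals. A constant slope is exactly what a straight line on the corresponding plot means.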

Review Questions

See this review link

Links to other resources

Big-O Notation

ChangeLog

2020-04-25 Mariana Silva mfsilva@illinois.edu: small text revisions
2020-02-19 Peter Sentz sentz2@illinois.edu: Add section on role of constants, change Big-Oh’s to “mathcal”
2020-01-26 Wanjun Jiang wjiang24@illinois.edu: add scientific notations, digits and figures
2018-01-31 Aming Ni amingni2@illinois.edu: changed three graphs
2018-01-16 Yu Meng yumeng5@illinois.edu: minor fixes throughout
2017-11-02 Erin Carrier ecarrie2@illinois.edu: adds changelog
2017-10-26 Erin Carrier ecarrie2@illinois.edu: adds review questions, minor changes throughout to better match terminology in class notes
2017-10-23 John Doherty jjdoher2@illinois.edu: first complete draft
2017-10-17 Luke Olson lukeo@illinois.edu: outline