Subsection 1.1.2 Newton's method in the real domain

Let's take a more general look at Newton's method. The problem is to estimate a root of a real, differentiable function \(f\text{,}\) i.e. a value of \(x\) such that \(f(x)=0\text{.}\) Suppose also that we have an initial guess \(x_0\) for the actual value of the root, which we denote \(r\text{.}\) Often, the value of \(x_0\) will be based on a graph. Since \(f\) is differentiable, we can estimate \(r\) with the root of the tangent line approximation to \(f\) at \(x_0\text{;}\) let's call this point \(x_1\text{.}\) When \(x_0\) is close to \(r\text{,}\) we often find that \(x_1\) is even closer. This process is illustrated in Figure 2 where it indeed appears that \(x_1\) is much closer to \(r\) than \(x_0\text{.}\)

Figure 1.1.2 One step in Newton's method

We now find a formula for \(x_1\) in terms of the given information. Recall that \(x_1\) is the root of the tangent line approximation to \(f\) at \(x_0\text{;}\) let's call this tangent line approximation \(\ell\text{.}\) Thus,

\begin{equation*} \ell(x) = f(x_0) + f'(x_0)(x-x_0) \end{equation*}

and \(\ell(x_1)=0\text{.}\) Thus, we must simply solve

\begin{equation*} f(x_0) + f'(x_0)(x-x_0) = 0 \end{equation*}

for \(x\) to get \(x_1 = x_0-f(x_0)/f'(x_0)\text{,}\) provided that \(f'(x_0)\neq 0\text{.}\)

Now, of course, \(x_1\) is again a point that is close to \(r\text{.}\) Thus, we can repeat the process with \(x_1\) as the guess. The new, better estimate will then be

\begin{equation*} x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}. \end{equation*}

The process can then repeat. Thus, we can define a sequence \((x_n)\) recursively by

\begin{equation*} x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}. \end{equation*}

This process is called iteration and the sequence it generates often converges to the root \(r\text{.}\)
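The recursion above translates almost directly into code. The following is a minimal sketch rather than a definitive implementation; the helper name `newton_sequence`, the test function \(x^3-2x-5\text{,}\) the starting value \(2\text{,}\) and the number of steps are all assumptions chosen purely for illustration.

```python
def newton_sequence(f, fprime, x0, steps):
    """Return the list [x_0, x_1, ..., x_steps] produced by Newton's method."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        # One Newton step; this raises ZeroDivisionError if fprime(x) == 0.
        xs.append(x - f(x) / fprime(x))
    return xs


# Illustrative inputs (assumptions, not taken from the text).
f = lambda x: x**3 - 2 * x - 5
fprime = lambda x: 3 * x**2 - 2

xs = newton_sequence(f, fprime, 2.0, 6)
for n, x in enumerate(xs):
    # Print the step number, the iterate, and the residual f(x_n).
    print(n, x, f(x))
```

The residual \(f(x_n)\) printed in the last column is a simple way to check whether the sequence is actually approaching a root.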

Subsubsection 1.1.2.1 Examples

We now present several examples to illustrate the variety of things that can happen when we apply Newton's method. Throughout, we have a function \(f\) and we define the corresponding Newton's method iteration function

\begin{equation*} N(x) = x - \frac{f(x)}{f'(x)}. \end{equation*}

We then iterate the function \(N\) from some starting point or, perhaps, from several starting points.

Example 1.1.3

We start with \(f(x)=x^2-2\text{.}\) Of course, \(f\) has two roots, namely \(\pm\sqrt{2}\text{.}\) Thus, we might think of the application of Newton's method to \(f\) as a tool to find good approximations to \(\sqrt{2}\text{.}\)

First, we compute \(N\text{:}\)

\begin{align*} N(x) &= x-f(x)/f'(x) = x-\frac{x^2-2}{2x} \\ &= x-\left(\frac{x^2}{2x}-\frac{2}{2x}\right) = x-\left(\frac{x}{2}-\frac{1}{x}\right) = \frac{x}{2}+\frac{1}{x}. \end{align*}

Now, suppose that \(x_0=1\text{.}\) Then,

\begin{align*} x_1 &= N(1) = \frac{1}{2}+\frac{1}{1} = \frac{3}{2}\\ x_2 &= N(3/2) = \frac{3/2}{2}+\frac{1}{3/2} = \frac{17}{12}\\ x_3 &= N(17/12) = \frac{17/12}{2}+\frac{1}{17/12} = \frac{577}{408} \end{align*}

Note that

\begin{equation*} (577/408)^2 = 332929/166464 = 2+1/166464 \end{equation*}

so the third iterate is quite close to \(\sqrt{2}\text{.}\)
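These exact fractions can be double-checked with rational arithmetic. The sketch below uses Python's standard `fractions` module; the variable names are assumptions made for illustration.

```python
from fractions import Fraction


def N(x):
    # Newton's method iteration function for f(x) = x^2 - 2.
    return x / 2 + 1 / x


x = Fraction(1)        # x_0 = 1, stored exactly
iterates = [x]
for _ in range(3):
    x = N(x)           # Fraction arithmetic keeps every iterate exact
    iterates.append(x)

for n, xn in enumerate(iterates):
    # Show each iterate together with the exact error in its square.
    print(n, xn, xn**2 - 2)
```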

Note that we've obtained a rational approximation to \(\sqrt{2}\text{.}\) At the same time, it's clear that it would be nice to perform these computations on a computer. In that context, we might generate a decimal approximation to \(\sqrt{2}\text{.}\) Here's how this process might go in Python:
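The following is a minimal sketch of such a computation, not necessarily the original listing; the choice of six iterations is an assumption.

```python
def N(x):
    # Newton's method iteration function for f(x) = x^2 - 2.
    return x / 2 + 1 / x


x = 1.0  # initial guess x_0 = 1 as a floating point number
for n in range(1, 7):
    x = N(x)
    print(n, x)
```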

Note how quickly the process has converged to 12 digits of precision.

Of course, \(f\) has two roots. How can we choose \(x_0\) so that the process converges to \(-\sqrt{2}\text{?}\) You'll explore this question computationally in Exercise 1.3.1. It's worth noting, though, that a little geometric understanding can go a long way. Figure 4, for example, shows us that if we start with a number \(x_0\) between zero and \(\sqrt{2}\text{,}\) then \(x_1\) will be larger than \(\sqrt{2}\text{.}\) The same picture shows us that any number larger than \(\sqrt{2}\) leads to a sequence that converges to \(\sqrt{2}\text{.}\)

Figure 1.1.4 Three steps in Newton's method for \(f(x)=x^2-2\)

Example 1.1.5

We now take a look at \(f(x)=x^2+3\text{.}\) A simple look at the graph of \(f\) shows that it doesn't even hit the \(x\)-axis; thus, \(f\) has no roots. It's not at all clear what to expect from Newton's method.

A simple computation shows that the Newton's method iteration function is

\begin{equation*} N(x) = \frac{x}{2} - \frac{3}{2x}. \end{equation*}
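For completeness, here is one way to carry out that computation, using \(f'(x)=2x\) and following the same algebra as in Example 3:

\begin{align*} N(x) &= x-\frac{f(x)}{f'(x)} = x-\frac{x^2+3}{2x} \\ &= x-\left(\frac{x}{2}+\frac{3}{2x}\right) = \frac{x}{2}-\frac{3}{2x}. \end{align*}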

Note that \(N(1)=-1\) and \(N(-1)=1\text{.}\) In the general context of iteration that we'll consider later, we'll say that the points \(1\) and \(-1\) lie on an orbit of period 2 under iteration of the function \(N\text{.}\)

Sequences of other points seem more complicated so we turn to the computer. Suppose that we change the initial seed \(x_0=1\) just a little tiny bit and iterate with Python.
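A minimal sketch of such an experiment follows, assuming the perturbed seed \(x_0=0.99\) and an arbitrary choice of 20 iterations. It first confirms the period 2 orbit numerically.

```python
def N(x):
    # Newton's method iteration function for f(x) = x^2 + 3.
    return x / 2 - 3 / (2 * x)


# The points 1 and -1 are swapped by N, forming an orbit of period 2.
print(N(1), N(-1))

# Perturb the seed slightly and record the orbit.
x = 0.99
orbit = [x]
for n in range(1, 21):
    x = N(x)
    orbit.append(x)
    print(n, x)
```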

Well, there appears to be no particular pattern in the numbers. In fact, if we generate 1000 iterates and plot those that lie within 10 units of the origin on a number line, we get Figure 6. This is our first illustration of chaotic behavior, not just because we see points spread throughout the interval but also because we appeared to have a stable orbit when \(x_0=1\text{.}\) Why should the behavior be so different when we change that initial seed to \(x_0=0.99\text{?}\)

Figure 1.1.6 Chaotic behavior from Newton's method

Example 1.1.7
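The data behind a picture like this can be generated along the following lines. This is only a sketch under stated assumptions: it starts from \(x_0=0.99\text{,}\) keeps the iterates within 10 units of the origin, and replaces the plot with a crude count of points per unit interval; the names `kept` and `counts` are hypothetical.

```python
from collections import Counter
from math import floor


def N(x):
    # Newton's method iteration function for f(x) = x^2 + 3.
    return x / 2 - 3 / (2 * x)


x = 0.99
kept = []
for _ in range(1000):
    if x == 0:
        break  # guard against division by zero (very unlikely in practice)
    x = N(x)
    if abs(x) <= 10:
        kept.append(x)

# Count how many kept iterates fall in each interval [k, k+1).
counts = Counter(floor(p) for p in kept)
for k in sorted(counts):
    print(f"[{k:3d}, {k + 1:3d}) {counts[k]}")
print(len(kept), "iterates lie within 10 units of the origin")
```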

Newton's original example had one real root, Example 3 had two real roots and Example 5 had no real roots. Let's take a look at an example with lots of real roots, namely \(f(x)=\cos(x)\text{.}\)

Generally, the closer your initial seed is to a root, the more likely the sequence starting from that seed is to converge to that root. What happens, though, if we start some place that's not so close to a root? What if we start close to the maximum - near zero? Let's investigate in code.
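The listing below is a reconstruction of such an experiment, offered as a sketch rather than the original code. It assumes the iteration function is called `n` and uses the standard `math` and `random` modules; since the seeds are random, the printed values change from run to run.

```python
from math import cos, sin
from random import random


def n(x):
    # Newton's method iteration function for f(x) = cos(x):
    # n(x) = x - cos(x)/(-sin(x)) = x + cos(x)/sin(x).
    return x + cos(x) / sin(x)


results = []
for run in range(10):
    # Random seed between 0 and 0.1, just to the right of the maximum at 0.
    xi = random() / 10
    for i in range(8):
        xi = n(xi)
    results.append((xi, cos(xi)))
    print(xi, cos(xi))
```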

OK, let's pick this code apart. The inner for loop looks like so:

xi = random()/10
for i in range(8):
  xi = n(xi)

Thus, xi is set to be a random number between \(0\) and \(0.1\text{;}\) the for loop then iterates the Newton's method function for the cosine from that initial seed 8 times. The outer for loop simply performs this experiment 10 times. After each run of the experiment, we print the resulting xi, along with the value of the cosine at that point, to check that we're indeed close to a root of the cosine. It's striking that we get 9 different results over 10 runs even though the starting points are so close to one another.

Figure 8 gives some clue as to what's going on. Recall that we can envision a Newton step for a function \(f\) from a point \(x_i\) by drawing the line that is tangent to the graph of \(f\) at the point \((x_i,f(x_i))\text{.}\) The value of \(x_{i+1}\) is then the point of intersection of this line with the \(x\)-axis. Because the slope at the maximum is zero, the value of this point of intersection is very sensitive to small changes in \(x_i\) near the maximum. In fact, there are infinitely many roots of the cosine, any one of which could be hit by some initial seed in this tiny interval.

Figure 1.1.8 Initial Newton steps for the cosine
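The sensitivity near the maximum can also be seen numerically. The sketch below takes a single Newton step from a few seeds in the interval \((0, 0.1]\) and reports the root of the cosine nearest to where each step lands; the particular seeds are assumptions chosen for illustration.

```python
from math import cos, sin, pi


def n(x):
    # Newton's method iteration function for f(x) = cos(x).
    return x + cos(x) / sin(x)


for x0 in [0.01, 0.02, 0.03, 0.05, 0.1]:
    x1 = n(x0)
    # The roots of the cosine are (k + 1/2)*pi; find the one nearest to x1.
    k = round(x1 / pi - 0.5)
    nearest_root = (k + 0.5) * pi
    print(x0, x1, nearest_root)
```

Since \(\cos(x)/\sin(x)\) behaves like \(1/x\) for small \(x\text{,}\) seeds that differ by a hundredth land tens of units apart after one step.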