13  Numerical Integration

13.1 The Approximation Problem

The Fundamental Theorem (Theorem 9.4) reduces integration to antidifferentiation: given F with F' = f, we compute \int_a^b f(x)\,dx = F(b) - F(a).

This is computationally trivial—two function evaluations suffice. But the theorem presumes we possess F in closed form. For many functions, this assumption fails.

Consider \int_0^1 e^{-x^2}\,dx. The integrand e^{-x^2} is continuous, hence integrable. The Fundamental Theorem guarantees the integral exists as a definite real number. Yet no combination of polynomials, exponentials, logarithms, and trigonometric functions produces an antiderivative. We proved in Elementary Integrals that e^{-x^2} admits a power series antiderivative, but for practical computation—when a numerical value is required—this does not suffice.

We face a gap: theoretical existence versus computational access. The Riemann integral exists as the limit of approximating sums (Section 8.7), but we defined it as an infinite process—taking finer and finer partitions. To compute numerically, we must halt at finite precision.

This chapter develops systematic methods for approximating \int_a^b f(x)\,dx using only function evaluations at finitely many points. We establish error bounds showing how accuracy improves as we sample more points. The methods derive from the definition of the Riemann integral itself: approximate f by simpler functions (constants, lines, parabolas), integrate the approximation exactly, and control the error.

Historical note. Before calculus, quadrature (finding areas) relied on geometric dissections and limiting arguments—Archimedes’ method of exhaustion being the paradigm. Newton and Leibniz revolutionized this by connecting integration to differentiation, reducing quadrature to finding antiderivatives. But when antiderivatives proved elusive, 18th-century mathematicians (Cotes, Simpson, Euler) developed systematic approximation schemes. These remain foundational in numerical analysis.

Modern relevance. While antiderivatives are theoretically elegant, computational practice often bypasses them. Numerical integration underpins:

  • Finite element methods (structural engineering, fluid dynamics)
  • Monte Carlo simulation (statistical mechanics, finance)
  • Machine learning (training neural networks involves high-dimensional integrals)
  • Computer graphics (rendering realistic lighting requires integrating over surfaces)

We will mention these connections where natural, but our focus remains mathematical: understanding approximation schemes rigorously.


13.2 Rectangular Approximations

Return to the definition of the Riemann integral. Given a partition P = \{x_0, x_1, \ldots, x_n\} of [a,b], we formed sums S(f, P, \{t_i\}) = \sum_{i=1}^{n} f(t_i)(x_i - x_{i-1}), where t_i \in [x_{i-1}, x_i] is arbitrary (Section 8.8). As \|P\| \to 0, these converge to \int_a^b f regardless of the choice of sample points.

For numerical work, we fix a partition and sample points. The three natural choices:

Left endpoint: t_i = x_{i-1}, giving L_n = \sum_{i=1}^{n} f(x_{i-1})(x_i - x_{i-1}).

Right endpoint: t_i = x_i, giving R_n = \sum_{i=1}^{n} f(x_i)(x_i - x_{i-1}).

Midpoint: t_i = (x_{i-1} + x_i)/2, giving M_n = \sum_{i=1}^{n} f\left(\frac{x_{i-1} + x_i}{2}\right)(x_i - x_{i-1}).

For uniform partitions with \Delta x = (b-a)/n and x_i = a + i\Delta x, these simplify: L_n = \Delta x \sum_{i=0}^{n-1} f(x_i), \quad R_n = \Delta x \sum_{i=1}^{n} f(x_i), \quad M_n = \Delta x \sum_{i=1}^{n} f(x_{i-1/2}), where x_{i-1/2} = a + (i - 1/2)\Delta x.

Geometric interpretation. Each sum approximates the area under f by rectangles:

  • L_n uses left heights—underestimates if f increases, overestimates if f decreases
  • R_n uses right heights—opposite behavior
  • M_n uses midpoint heights—often more accurate than either endpoint

[Figure: Rectangular approximations]

Example 13.1 (Rectangular Approximations) Approximate \int_0^1 x^2\,dx using n=4 subintervals.

We have \Delta x = 1/4 and x_i = i/4. Compute: \begin{align*} L_4 &= \frac{1}{4}\left[f(0) + f(1/4) + f(1/2) + f(3/4)\right] \\ &= \frac{1}{4}\left[0 + \frac{1}{16} + \frac{1}{4} + \frac{9}{16}\right] = \frac{1}{4} \cdot \frac{14}{16} = \frac{7}{32} = 0.21875. \end{align*}

\begin{align*} R_4 &= \frac{1}{4}\left[f(1/4) + f(1/2) + f(3/4) + f(1)\right] \\ &= \frac{1}{4}\left[\frac{1}{16} + \frac{1}{4} + \frac{9}{16} + 1\right] = \frac{1}{4} \cdot \frac{30}{16} = \frac{15}{32} = 0.46875. \end{align*}

\begin{align*} M_4 &= \frac{1}{4}\left[f(1/8) + f(3/8) + f(5/8) + f(7/8)\right] \\ &= \frac{1}{4}\left[\frac{1}{64} + \frac{9}{64} + \frac{25}{64} + \frac{49}{64}\right] = \frac{1}{4} \cdot \frac{84}{64} = \frac{21}{64} = 0.328125. \end{align*}

The exact value (from the Second Fundamental Theorem) is \int_0^1 x^2\,dx = [x^3/3]_0^1 = 1/3 \approx 0.333\ldots. The midpoint rule gives the closest approximation.
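These formulas translate directly into code. Below is a minimal sketch in plain Python (the function names are ours, not from any library) that reproduces the numbers of Example 13.1.

    def left_rule(f, a, b, n):
        """Left-endpoint rule L_n on a uniform partition of [a, b]."""
        dx = (b - a) / n
        return dx * sum(f(a + i * dx) for i in range(n))

    def right_rule(f, a, b, n):
        """Right-endpoint rule R_n."""
        dx = (b - a) / n
        return dx * sum(f(a + i * dx) for i in range(1, n + 1))

    def midpoint_rule(f, a, b, n):
        """Midpoint rule M_n."""
        dx = (b - a) / n
        return dx * sum(f(a + (i - 0.5) * dx) for i in range(1, n + 1))

    f = lambda x: x ** 2
    print(left_rule(f, 0, 1, 4))      # 0.21875  (= 7/32)
    print(right_rule(f, 0, 1, 4))     # 0.46875  (= 15/32)
    print(midpoint_rule(f, 0, 1, 4))  # 0.328125 (= 21/64)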


13.3 The Trapezoidal Rule

Rectangular approximations treat f as piecewise constant on each subinterval. We can do better by approximating f with piecewise linear functions.

On [x_{i-1}, x_i], the linear interpolant through (x_{i-1}, f(x_{i-1})) and (x_i, f(x_i)) is \ell_i(x) = f(x_{i-1}) + \frac{f(x_i) - f(x_{i-1})}{x_i - x_{i-1}}(x - x_{i-1}).

Integrate: \int_{x_{i-1}}^{x_i} \ell_i(x)\,dx = \frac{f(x_{i-1}) + f(x_i)}{2}(x_i - x_{i-1}).

This is the area of a trapezoid with parallel sides f(x_{i-1}) and f(x_i) and width x_i - x_{i-1}.

[Figure: Trapezoidal approximation]

Summing over all subintervals:

Definition 13.1 (Trapezoidal Rule) T_n = \sum_{i=1}^{n} \frac{f(x_{i-1}) + f(x_i)}{2}(x_i - x_{i-1}).

For uniform partitions with \Delta x = (b-a)/n: T_n = \frac{\Delta x}{2}\left[f(x_0) + 2f(x_1) + 2f(x_2) + \cdots + 2f(x_{n-1}) + f(x_n)\right].

The formula has a pleasing structure: endpoints are weighted 1/2, interior points are weighted 1.

Relation to rectangular rules. Observe: T_n = \frac{L_n + R_n}{2}.

The trapezoidal rule is the average of left and right endpoint approximations. This symmetry often improves accuracy.

Example 13.2 (Trapezoidal Rule Application) For \int_0^1 x^2\,dx with n=4: \begin{align*} T_4 &= \frac{1/4}{2}\left[f(0) + 2f(1/4) + 2f(1/2) + 2f(3/4) + f(1)\right] \\ &= \frac{1}{8}\left[0 + 2 \cdot \frac{1}{16} + 2 \cdot \frac{1}{4} + 2 \cdot \frac{9}{16} + 1\right] \\ &= \frac{1}{8}\left[\frac{2}{16} + \frac{8}{16} + \frac{18}{16} + \frac{16}{16}\right] = \frac{1}{8} \cdot \frac{44}{16} = \frac{11}{32} = 0.34375. \end{align*}

This is closer to 1/3 than either L_4 or R_4.

Computational cost. The trapezoidal rule requires n+1 function evaluations (the partition points). Rectangular rules also require n or n+1 evaluations. The improvement comes not from reduced cost but from exploiting function smoothness.
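A corresponding sketch for the trapezoidal rule (again, the helper name is ours), reproducing T_4 from Example 13.2 and its error:

    def trapezoid_rule(f, a, b, n):
        """Trapezoidal rule T_n on a uniform partition:
        endpoints weighted 1/2, interior points weighted 1."""
        dx = (b - a) / n
        interior = sum(f(a + i * dx) for i in range(1, n))
        return dx * (0.5 * (f(a) + f(b)) + interior)

    f = lambda x: x ** 2
    t4 = trapezoid_rule(f, 0, 1, 4)
    print(t4)             # 0.34375 (= 11/32)
    print(abs(t4 - 1/3))  # error = 1/96 ≈ 0.0104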


13.4 Simpson’s Rule

The trapezoidal rule approximates f by piecewise linear functions. If f is smooth, we can use piecewise quadratics for better accuracy.

Divide [a,b] into an even number of subintervals: n = 2m. On each pair of adjacent subintervals [x_{2i-2}, x_{2i}], fit a parabola through the three points (x_{2i-2}, f(x_{2i-2})), (x_{2i-1}, f(x_{2i-1})), (x_{2i}, f(x_{2i})) and integrate.

For uniform spacing \Delta x = (b-a)/n, the integral of the parabola over [x_{2i-2}, x_{2i}] equals \frac{\Delta x}{3}\left[f(x_{2i-2}) + 4f(x_{2i-1}) + f(x_{2i})\right].

We defer the derivation of this formula (it follows from Lagrange interpolation), stating the result:

Definition 13.2 (Simpson’s Rule) For n = 2m even and uniform partition: S_n = \frac{\Delta x}{3}\left[f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + 2f(x_4) + \cdots + 4f(x_{n-1}) + f(x_n)\right].

Pattern: endpoints weighted 1, odd indices weighted 4, even interior indices weighted 2.

Example 13.3 (Simpson’s Rule Exactness) For \int_0^1 x^2\,dx with n=4 (so m=2): \begin{align*} S_4 &= \frac{1/4}{3}\left[f(0) + 4f(1/4) + 2f(1/2) + 4f(3/4) + f(1)\right] \\ &= \frac{1}{12}\left[0 + 4 \cdot \frac{1}{16} + 2 \cdot \frac{1}{4} + 4 \cdot \frac{9}{16} + 1\right] \\ &= \frac{1}{12}\left[\frac{4}{16} + \frac{8}{16} + \frac{36}{16} + \frac{16}{16}\right] = \frac{1}{12} \cdot \frac{64}{16} = \frac{1}{3}. \end{align*}

Simpson’s rule gives the exact answer. This is not coincidence—Simpson’s rule is exact for polynomials up to degree 3, and x^2 is such a polynomial. We establish this shortly.
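A sketch of the composite rule (the helper name simpson_rule is ours); the second call illustrates the degree-3 exactness claimed above on an arbitrary cubic:

    def simpson_rule(f, a, b, n):
        """Composite Simpson's rule S_n; n must be even."""
        if n % 2:
            raise ValueError("n must be even")
        dx = (b - a) / n
        total = f(a) + f(b)
        total += 4 * sum(f(a + i * dx) for i in range(1, n, 2))  # odd indices
        total += 2 * sum(f(a + i * dx) for i in range(2, n, 2))  # even interior indices
        return dx / 3 * total

    print(simpson_rule(lambda x: x ** 2, 0, 1, 4))          # 0.3333... (exact value 1/3)
    print(simpson_rule(lambda x: x ** 3 - 2 * x, 0, 1, 2))  # -0.75, exact for this cubic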

Historical note. Thomas Simpson (1710–1761) popularized this formula in his textbook Mathematical Dissertations (1743), though Newton knew it earlier. It became the standard for practical quadrature in navigation and astronomy in the pre-computer era.


13.5 Error Analysis

We have described approximation schemes, but without error bounds, they are useless. How rapidly do T_n and S_n converge to \int_a^b f? How many subintervals are required for prescribed accuracy?

Error analysis is the soul of numerical methods. Without it, we cannot distinguish good approximations from bad, nor predict when to trust computed values.

13.5.1 Trapezoidal Rule Error

Theorem 13.1 (Trapezoidal Rule Error Bound) Let f \in C^2([a,b]) (twice continuously differentiable). Then \left|\int_a^b f(x)\,dx - T_n\right| \leq \frac{(b-a)^3}{12n^2} \max_{x \in [a,b]} |f''(x)|.

The error decreases as O(1/n^2)—doubling n reduces error by a factor of 4. The bound depends on the second derivative: the more f curves, the larger the error.

We prove the bound for a single subinterval [x_{i-1}, x_i] and sum.

Let h = x_i - x_{i-1} and assume for simplicity that [x_{i-1}, x_i] = [0, h] (by translation). The trapezoidal approximation is T_i = \frac{h}{2}[f(0) + f(h)].

The exact integral is I_i = \int_0^h f(x)\,dx.

Define the error E_i = I_i - T_i. We seek a bound on |E_i|.

Expand f in a Taylor series around x = h/2 (the midpoint): f(x) = f(h/2) + f'(h/2)(x - h/2) + \frac{f''(\xi)}{2}(x - h/2)^2 for some \xi between x and h/2 (by Taylor’s theorem with remainder; note that \xi depends on x).

Integrate from 0 to h: \int_0^h f(x)\,dx = f(h/2) \cdot h + f'(h/2) \int_0^h (x - h/2)\,dx + \frac{1}{2}\int_0^h f''(\xi)(x - h/2)^2\,dx.

The middle term vanishes by symmetry: \int_0^h (x - h/2)\,dx = 0. For the last term, bound |f''(\xi)| \leq M_2 = \max_{[0,h]} |f''|: \left|\int_0^h f''(\xi)(x - h/2)^2\,dx\right| \leq M_2 \int_0^h (x - h/2)^2\,dx = M_2 \cdot \frac{h^3}{12}.

Now compute the trapezoidal approximation. By Taylor expansion at the endpoints: f(0) = f(h/2) - f'(h/2) \cdot \frac{h}{2} + \frac{f''(\xi_0)}{2}\left(\frac{h}{2}\right)^2, f(h) = f(h/2) + f'(h/2) \cdot \frac{h}{2} + \frac{f''(\xi_h)}{2}\left(\frac{h}{2}\right)^2.

Adding: f(0) + f(h) = 2f(h/2) + \frac{1}{2}\left[\frac{h^2}{4}(f''(\xi_0) + f''(\xi_h))\right].

Thus: T_i = \frac{h}{2}[f(0) + f(h)] = hf(h/2) + \frac{h^3}{16}[f''(\xi_0) + f''(\xi_h)].

Comparing with the exact integral: E_i = I_i - T_i = \frac{1}{2}\int_0^h f''(\xi)(x - h/2)^2\,dx - \frac{h^3}{16}[f''(\xi_0) + f''(\xi_h)].

Bounding both terms by M_2 and applying the triangle inequality: |E_i| \leq \frac{M_2}{2} \cdot \frac{h^3}{12} + \frac{M_2 h^3}{8} = \frac{M_2 h^3}{6}. This already gives the O(h^3) per-interval error, but with constant 1/6 rather than the 1/12 of the theorem: the two error terms carry opposite signs to leading order, and the triangle inequality ignores this cancellation. The sharper constant follows from the kernel representation obtained by integrating by parts twice, E_i = -\frac{1}{2}\int_0^h x(h-x)\,f''(x)\,dx, which gives |E_i| \leq \frac{M_2}{2}\int_0^h x(h-x)\,dx = \frac{M_2 h^3}{12}.

For the full interval [a,b] with n subintervals of width h = (b-a)/n: \left|\sum_{i=1}^{n} E_i\right| \leq n \cdot \frac{M_2 h^3}{12} = \frac{M_2}{12} \cdot n \cdot \frac{(b-a)^3}{n^3} = \frac{M_2 (b-a)^3}{12n^2}.

Setting M_2 = \max_{[a,b]} |f''| yields the result. \square

Example 13.4 (Error Bound Application) Bound the error for \int_0^1 x^2\,dx using T_4.

We have f(x) = x^2, so f''(x) = 2, giving \max |f''| = 2. The bound is: \left|\int_0^1 x^2\,dx - T_4\right| \leq \frac{(1-0)^3}{12 \cdot 4^2} \cdot 2 = \frac{1}{96} \approx 0.0104.

The actual error is |1/3 - 11/32| = |32/96 - 33/96| = 1/96—the bound is sharp.
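A quick numerical check of the bound (a sketch; trapezoid_rule is the same helper as in the sketch of Section 13.3):

    def trapezoid_rule(f, a, b, n):
        dx = (b - a) / n
        return dx * (0.5 * (f(a) + f(b)) + sum(f(a + i * dx) for i in range(1, n)))

    f, exact = lambda x: x ** 2, 1 / 3
    for n in (4, 8, 16, 32):
        err = abs(trapezoid_rule(f, 0, 1, n) - exact)
        bound = (1 - 0) ** 3 / (12 * n ** 2) * 2  # max |f''| = 2 on [0, 1]
        # err matches bound (up to rounding): f'' is constant, so the bound is attained
        print(n, err, bound)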

13.5.2 Simpson’s Rule Error

Theorem 13.2 (Simpson’s Rule Error Bound) Let f \in C^4([a,b]). Then \left|\int_a^b f(x)\,dx - S_n\right| \leq \frac{(b-a)^5}{180n^4} \max_{x \in [a,b]} |f^{(4)}(x)|.

The error is O(1/n^4)—quadrupling n reduces error by a factor of 256. Simpson’s rule converges much faster than the trapezoidal rule when f is smooth.

Moreover, the error depends on the fourth derivative. If f is a polynomial of degree \leq 3, then f^{(4)} = 0, so the error is zero—Simpson’s rule is exact for cubics.

The proof follows the same pattern as for the trapezoidal rule but requires Taylor expansion to fourth order. We omit the details, which involve tedious but straightforward calculus.

The key insight: fitting parabolas matches up to the second derivative at each point. The error comes from the third and fourth derivatives. Remarkably, by symmetry, the third-derivative terms cancel, leaving only fourth-order error. This explains the O(h^4) behavior. \square

Example. For \int_0^1 e^{-x^2}\,dx, bound the error using S_{10}.

We have f(x) = e^{-x^2}. Computing derivatives: f'(x) = -2xe^{-x^2}, \quad f''(x) = (4x^2 - 2)e^{-x^2}, \quad \ldots

The fourth derivative is a degree-4 polynomial times e^{-x^2}. On [0,1], we have |e^{-x^2}| \leq 1. By explicit computation (or symbolic software), \max_{[0,1]} |f^{(4)}(x)| \leq 12 (this is a rough bound).

Thus: \left|\int_0^1 e^{-x^2}\,dx - S_{10}\right| \leq \frac{1^5}{180 \cdot 10^4} \cdot 12 = \frac{12}{1800000} \approx 6.7 \times 10^{-6}.

With 10 subintervals (hence 11 function evaluations), Simpson’s rule guarantees an error below 10^{-5}—roughly six-digit accuracy.
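A quick check of this example (a sketch reusing the simpson_rule helper from Section 13.4); the reference value uses the error function, since \int_0^1 e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}\,\mathrm{erf}(1):

    import math

    def simpson_rule(f, a, b, n):
        dx = (b - a) / n
        total = f(a) + f(b)
        total += 4 * sum(f(a + i * dx) for i in range(1, n, 2))
        total += 2 * sum(f(a + i * dx) for i in range(2, n, 2))
        return dx / 3 * total

    f = lambda x: math.exp(-x ** 2)
    approx = simpson_rule(f, 0, 1, 10)
    exact = math.sqrt(math.pi) / 2 * math.erf(1)  # closed form via the error function
    print(approx, abs(approx - exact))  # actual error is well below the 6.7e-6 bound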


13.6 Adaptive Quadrature

The error bounds in Theorem 13.1 and Theorem 13.2 are global—they apply uniformly over [a,b]. But functions often vary in smoothness. Near a singularity or rapid oscillation, more sampling is needed; elsewhere, fewer points suffice.

Adaptive quadrature allocates computational effort where it’s needed, refining the partition in regions of high error.

The basic idea:

  1. Estimate the integral on [a,b] using n points: I_n \approx \int_a^b f

  2. Estimate using 2n points: I_{2n} \approx \int_a^b f

  3. If |I_{2n} - I_n| < \epsilon (user tolerance), accept I_{2n}

  4. Otherwise, split [a,b] at midpoint c = (a+b)/2 and recursively apply to [a,c] and [c,b]

The difference |I_{2n} - I_n| estimates the error. When error is small, we stop refining. When large, we subdivide and repeat.

Error estimation. For Simpson’s rule, if S_n and S_{2n} denote approximations with n and 2n subintervals: \int_a^b f - S_n \approx C/n^4, \quad \int_a^b f - S_{2n} \approx C/(2n)^4 = C/(16n^4).

Subtracting: S_{2n} - S_n \approx \left(1 - \frac{1}{16}\right)\frac{C}{n^4} = \frac{15C}{16n^4}.

Thus: \int_a^b f - S_{2n} \approx \frac{1}{15}(S_{2n} - S_n).

The difference S_{2n} - S_n estimates the error in S_{2n} up to a constant factor. Requiring |S_{2n} - S_n|/15 < \epsilon ensures |\int_a^b f - S_{2n}| \lesssim \epsilon.
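A minimal recursive sketch along these lines (our own illustration; production libraries such as scipy.integrate.quad wrap the more sophisticated Fortran QUADPACK routines):

    import math

    def simpson(f, a, b):
        """Simpson's rule on a single panel [a, b]."""
        c = (a + b) / 2
        return (b - a) / 6 * (f(a) + 4 * f(c) + f(b))

    def adaptive_simpson(f, a, b, eps):
        """Recursively subdivide until the estimate |S2 - S1| / 15 < eps."""
        c = (a + b) / 2
        s1 = simpson(f, a, b)                      # one panel
        s2 = simpson(f, a, c) + simpson(f, c, b)   # two panels (step halved)
        if abs(s2 - s1) < 15 * eps:
            return s2 + (s2 - s1) / 15             # add the error estimate as a correction
        return (adaptive_simpson(f, a, c, eps / 2) +
                adaptive_simpson(f, c, b, eps / 2))

    # Refines automatically near x = 0, where sqrt(x) is least smooth:
    print(adaptive_simpson(lambda x: math.sqrt(x), 0, 1, 1e-8))  # ≈ 0.6666667 (exact: 2/3)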

Advantages. Adaptive methods automatically handle irregular integrands without user intervention. They are the basis of modern numerical integration libraries (e.g., scipy.integrate.quad in Python, quadgk in Julia).

Remark on high dimensions. In one dimension, adaptive quadrature is efficient. In higher dimensions, the curse of dimensionality intervenes—grids grow exponentially with dimension. For d-dimensional integrals with n points per axis, we need n^d total evaluations. For d=10 and n=100, this is 10^{20} evaluations, computationally infeasible. Alternative methods (Monte Carlo, quasi-Monte Carlo) become necessary. We briefly discuss this in Section 13.9.


13.7 Newton-Cotes Formulas: General Theory

The trapezoidal and Simpson’s rules are instances of a broader class: Newton-Cotes formulas.

Definition 13.3 (Newton-Cotes Formula) Given n+1 equally spaced points x_0, x_1, \ldots, x_n in [a,b], the closed Newton-Cotes formula approximates \int_a^b f(x)\,dx \approx (b-a) \sum_{i=0}^{n} w_i f(x_i), where the weights w_i are determined by requiring exactness for polynomials of degree \leq n.

Exactness condition. The formula must integrate 1, x, x^2, \ldots, x^n exactly. This yields n+1 linear equations for the n+1 weights w_i.

Examples:

  • n=1 (Trapezoidal rule): Two points, linear interpolation, exact for degree \leq 1

    Weights: w_0 = w_1 = 1/2

  • n=2 (Simpson’s rule): Three points, quadratic interpolation, exact for degree \leq 3 (!)

    Weights: w_0 = 1/6, w_1 = 4/6, w_2 = 1/6

  • n=3 (Simpson’s 3/8 rule): Four points, cubic interpolation, exact for degree \leq 3

    Weights: w_0 = 1/8, w_1 = 3/8, w_2 = 3/8, w_3 = 1/8

The remarkable fact about Simpson’s rule (n=2) is that it achieves degree 3 exactness despite using only three points—one degree higher than expected. This is due to symmetry: odd-degree terms cancel.

Higher-order formulas exist (Boole’s rule, etc.), but they become numerically unstable—high-degree polynomial interpolation through equally spaced points suffers from Runge’s phenomenon (wild oscillations near endpoints). For practical computation, low-order formulas with adaptive refinement are preferred.
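As a check on the exactness condition of Definition 13.3, the weights listed above can be recovered by solving the linear system directly. Here is a short NumPy sketch (our own illustration, normalized to the interval [0,1]):

    import numpy as np

    def newton_cotes_weights(n):
        """Weights w_i for the closed Newton-Cotes rule on n+1 equally spaced
        nodes in [0, 1], normalized so that  integral ≈ (b - a) * sum(w_i * f(x_i))."""
        nodes = np.linspace(0.0, 1.0, n + 1)
        # Row k of the system: sum_i w_i * x_i**k = integral_0^1 x**k dx = 1/(k+1)
        A = np.vander(nodes, n + 1, increasing=True).T
        rhs = 1.0 / np.arange(1, n + 2)
        return np.linalg.solve(A, rhs)

    print(newton_cotes_weights(1))  # [0.5 0.5]                  trapezoidal
    print(newton_cotes_weights(2))  # [0.1667 0.6667 0.1667]     Simpson: 1/6, 4/6, 1/6
    print(newton_cotes_weights(3))  # [0.125 0.375 0.375 0.125]  Simpson's 3/8 rule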


13.8 Gaussian Quadrature

Newton-Cotes formulas use equally spaced points. What if we allow points to be placed optimally?

Gaussian quadrature chooses both the points x_i and weights w_i to maximize the degree of exactness. With n points, Gaussian quadrature achieves exactness for polynomials of degree \leq 2n-1—double what Newton-Cotes achieves.

Theorem 13.3 (Gaussian Quadrature) On [-1,1] with n points, the formula \int_{-1}^{1} f(x)\,dx \approx \sum_{i=1}^{n} w_i f(x_i) is exact for all polynomials of degree \leq 2n-1 if and only if the x_i are the zeros of the n-th Legendre polynomial P_n(x).

Legendre polynomials are orthogonal polynomials on [-1,1] with respect to the inner product \langle f, g \rangle = \int_{-1}^{1} f(x)g(x)\,dx. Their zeros are the optimal quadrature points.

Example: n=2 (two-point Gaussian rule).

The second Legendre polynomial is P_2(x) = \frac{1}{2}(3x^2 - 1), with zeros at x_1 = -1/\sqrt{3} and x_2 = 1/\sqrt{3}. The weights are w_1 = w_2 = 1.

The formula: \int_{-1}^{1} f(x)\,dx \approx f(-1/\sqrt{3}) + f(1/\sqrt{3}) is exact for all polynomials of degree \leq 3—matching Simpson’s rule with only two evaluations instead of three.

For general intervals [a,b], use the change of variables t = \frac{2x - (a+b)}{b-a} to map [a,b] to [-1,1]; since dx = \frac{b-a}{2}\,dt, the rule picks up a factor of \frac{b-a}{2}: \int_a^b f(x)\,dx = \frac{b-a}{2}\int_{-1}^{1} f\!\left(\frac{b-a}{2}t + \frac{a+b}{2}\right)dt.
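A minimal sketch of the two-point rule on a general interval [a,b], using the affine map above (the helper name gauss2 is ours):

    import math

    def gauss2(f, a, b):
        """Two-point Gauss-Legendre rule on [a, b]: nodes ±1/sqrt(3) mapped from [-1, 1]."""
        mid, half = (a + b) / 2, (b - a) / 2
        t = 1 / math.sqrt(3)
        return half * (f(mid - half * t) + f(mid + half * t))

    # Exact for cubics with only two evaluations:
    print(gauss2(lambda x: x ** 3, 0, 1))             # 0.25 (= integral of x^3 on [0, 1])
    print(gauss2(lambda x: math.exp(-x ** 2), 0, 1))  # ≈ 0.7466, already close to 0.746824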

Advantages. Gaussian quadrature achieves higher accuracy per function evaluation than Newton-Cotes formulas. It is the method of choice when function evaluations are expensive (e.g., solving differential equations at each point).

Disadvantages. The points are non-uniform and irrational, complicating adaptive refinement. For functions with singularities or discontinuities, adaptive Newton-Cotes may perform better.


13.9 Connections to Modern Computation

The methods developed here—rectangular, trapezoidal, Simpson’s, Gaussian—remain foundational in computational science. We briefly survey connections without digressing from our mathematical focus.

13.9.1 High-Dimensional Integration and Monte Carlo

For d-dimensional integrals \int_{\Omega} f(\mathbf{x})\,d\mathbf{x}, \quad \mathbf{x} \in \mathbb{R}^d, deterministic quadrature becomes impractical for d \geq 4 due to exponential growth in grid points.

Monte Carlo integration replaces structured grids with random sampling: \int_{\Omega} f(\mathbf{x})\,d\mathbf{x} \approx \frac{|\Omega|}{N} \sum_{i=1}^{N} f(\mathbf{x}_i), where \mathbf{x}_i are uniformly random points in \Omega. By the law of large numbers, this converges as O(1/\sqrt{N})—independent of dimension.

This underpins:

  • Statistical mechanics: Computing thermodynamic averages over high-dimensional phase spaces

  • Bayesian inference: Evaluating posterior probabilities in machine learning

  • Computer graphics: Rendering realistic lighting via path tracing

Monte Carlo trades deterministic guarantees for dimensional independence. For smooth low-dimensional integrals, Gaussian quadrature is superior. For rough high-dimensional integrals, Monte Carlo is the only feasible approach.
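A minimal Monte Carlo sketch over the unit cube (our own illustration; the test integrand, dimension, and sample sizes are arbitrary choices). The error shrinks roughly like 1/\sqrt{N}, independent of d:

    import random

    def monte_carlo(f, d, n_samples, seed=0):
        """Estimate the integral of f over the unit cube [0, 1]^d by uniform random sampling."""
        rng = random.Random(seed)
        total = 0.0
        for _ in range(n_samples):
            x = [rng.random() for _ in range(d)]
            total += f(x)
        return total / n_samples  # |Omega| = 1 for the unit cube

    # Integral of x_1^2 + ... + x_d^2 over [0, 1]^d is d/3 exactly.
    d = 10
    f = lambda x: sum(xi ** 2 for xi in x)
    for n in (10**2, 10**4, 10**6):
        print(n, abs(monte_carlo(f, d, n) - d / 3))  # error shrinks roughly like 1/sqrt(n)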

13.9.2 Finite Element Methods

Solving partial differential equations (PDEs) numerically—fluid flow, structural mechanics, electromagnetics—requires integrating over complex domains. The finite element method discretizes the domain into small elements (triangles, tetrahedra) and approximates the solution by piecewise polynomials.

Evaluating integrals over each element uses Gaussian quadrature. The accuracy of the PDE solution depends critically on accurate numerical integration. Simpson’s rule and higher-order Newton-Cotes formulas extend naturally to multi-dimensional elements.

This connects our one-dimensional theory to problems of immense practical importance: designing aircraft wings, simulating climate, modeling protein folding.

13.9.3 Machine Learning and Numerical Integration

Training neural networks involves optimizing \mathcal{L}(\theta) = \int p(x) \ell(f_\theta(x), y(x))\,dx, where p(x) is a data distribution, f_\theta is the network, and \ell is a loss function. In practice, integrals are approximated by sums over mini-batches—a form of Monte Carlo sampling.

More subtly, automatic differentiation (backpropagation) relies on the chain rule, which we developed rigorously in Calculus 1. The gradient \frac{\partial \mathcal{L}}{\partial \theta} propagates through compositions of functions, applying the chain rule at each layer. Understanding derivatives as linear maps (§ Vectors and Linear Maps) clarifies why automatic differentiation works: each layer applies a linear approximation, and the chain rule composes these approximations.


13.10 Remark on Limitations

We have developed methods for univariate integrals of smooth functions. Several important extensions lie beyond our scope:

  1. Singular integrands. Functions like \int_0^1 x^{-1/2}\,dx are integrable but unbounded. Standard quadrature fails near singularities. Specialized techniques (change of variables, subtraction of singularity) are required.

  2. Oscillatory integrands. Functions like \int_0^{\infty} \sin(x^2)\,dx oscillate rapidly. Standard methods converge slowly. Stationary phase and Filon-type methods exploit cancellation between oscillations.

  3. Infinite intervals. For \int_0^{\infty} e^{-x^2}\,dx, we must truncate or transform the interval. The substitution x = \tan(t) converts an integral over [0,\infty) into one over [0,\pi/2), often at the cost of an integrand that behaves badly near the new endpoint.

  4. Non-smooth functions. If f has jump discontinuities, error bounds depending on higher derivatives fail. Adaptive quadrature can handle this by detecting the jumps, but care is needed.

These are active areas in numerical analysis. Our goal has been to establish the foundational methods and error analysis framework. Extensions follow the same principles: approximate, bound error, refine adaptively.


13.11 Summary

We have developed the theory of numerical integration from first principles:

  1. Rectangular rules arise directly from the Riemann integral definition, providing O(1/n) convergence

  2. Trapezoidal rule approximates via piecewise linear interpolation, achieving O(1/n^2) convergence

  3. Simpson’s rule uses piecewise quadratic interpolation, achieving O(1/n^4) convergence for smooth functions

  4. Error bounds quantify convergence rates rigorously, connecting to higher derivatives

These methods form the computational backbone of modern scientific computing. While we focused on one dimension, the principles—approximation by simpler functions, error control via derivatives, adaptive refinement—extend to higher dimensions and underpin finite element methods, Monte Carlo simulation, and numerical solutions of differential equations.

The key insight remains: the Riemann integral is an infinite process, but finite approximations suffice when controlled rigorously. Understanding the error tells us when to stop refining—and that understanding flows directly from the analytical tools of calculus.