Integrating functions numerically using SciPy
Integration can be interpreted as the area that lies between a curve and the xaxis, signed according to whether this area is above or below the axis. Some integrals cannot be computed directly, using symbolic means, and instead have to be approximated numerically. One classic example of this is the Gaussian error function, which was mentioned in the Basic mathematical functions section in Chapter1, Basic Packages, Functions, and Concepts. This is defined by the formula
and the integral that appears here cannot be evaluated symbolically.
In this recipe, we will see how to use the numerical integration routines in the SciPy package to compute the integral of a function.
Getting ready
We use the scipy.integratemodule, which contains several routines for computing numerical integrals. We import this module as follows:
from scipy import integrate
How to do it...
The following steps describe how to numerically integrate a function using SciPy:
- We evaluate the integral that appears in the definition of the error function at the value x = 1. For this, we need to define the integrand (the function that appears inside the integral) in Python:
def erf_integrand(t):
return np.exp(-t**2)
There are two main routines in scipy.integratefor performing numerical integration (quadrature) that can be used. The first is the quadfunction, which uses QUADPACK to perform the integration, and the second is quadrature.
- The quadroutine is a general-purpose integration tool. It expects three arguments, which are the function to be integrated (erf_integrand), the lower limit (-1.0), and the upper limit (1.0):
val_quad, err_quad = integrate.quad(erf_integrand, -1.0, 1.0)
# (1.493648265624854, 1.6582826951881447e-14)
The first returned value is the value of the integral and the second is an estimate for the error.
- Repeating the computation with the quadratureroutine, we get the following. The arguments are the same as for the quad routine:
val_quadr, err_quadr = integrate.quadrature(erf_integrand, -1.0,
1.0)
# (1.4936482656450039, 7.459897144457273e-10)
The output is the same format as the code, with the value of the integral and then an estimate of the error. Notice that the error is larger for the quadratureroutine. This is a result of the method terminating once the estimated error falls below a given tolerance, which can be modified when the routine is called.
How it works...
Most numerical integration techniques follow the same basic procedure. First, we choose points xifor i = 1, 2,…, n in the region of integration, and then use these values and the values f(xi) to approximate the integral. For example, with the trapezium rule, we approximate the integral by
where a < x1< x2< … < xn-1< b and his the (common) difference between adjacent xivalues, including the end points aand b. This can be implemented in Python as follows:
def trapezium(func, a, b, n_steps):
"""Estimate an integral using the trapezium rule"""
h = (b - a) / n_steps
x_vals = np.arange(a + h, b, h)
y_vals = func(x_vals)
return 0.5*h*(func(a) + func(b) + 2.*np.sum(y_vals))
The algorithms used by quadand quadratureare far more sophisticated than this. Using this function to approximate the integral of erf_integrandusing trapeziumyields a result of 1.4936463036001209, which agrees with the approximations from the quad and quadrature routines to 5 decimal places.
The quadratureroutine uses a fixed tolerance Gaussian quadrature, whereas the quadroutine uses an adaptive algorithm implemented in the Fortran library QUADPACK routines. Timing both routines, we find that the quad routine is approximately 5 times faster than the quadrature routine for the problem described in the recipe. The quad routine executes in approximately 27 µs, averaged over 1 million executions, while the quadrature routine executes in approximately 134 µs. (Your results may differ depending on your system.)
There's more...
The routines mentioned in this section require the integrand function to be known, which is not always the case. Instead, it might be the case that we know a number of pairs (x,y) with y = f(x), but we don't know the function fto evaluate at additional points. In this case, we can use one of the sampling quadrature techniques from scipy.integrate. If the number of known points is very large and all points are equally spaced, we can use Romberg integration for a good approximation of the integral. For this, we use the rombroutine. Otherwise, we can use a variant of the trapezium rule (as above) using the trapzroutine, or Simpson's rule using the simps routine.