Back

Solving 2D diffusion equation in Python

01/04/2020

The 2D diffusion equation is a very simple and fun equation to solve, from which we can generate quite pretty 2D plots with. In this post I go through a set of ideas that I accumulated over the years while I was studying similar problems. The post's idea is keep all this knowledge for myself if I ever need it again. A short version of the differential equation can be expressed as \[ \dot{f} = c \nabla^2 f + s \] and we want to find a function of space and time \( f(X, t) \) that satisfies the equation above. \( c \) is a constant and \( s(X, t) \) is a "source term". Assuming a bidimensional domain, and using a finite differences discretization scheme for time and space, with an equally spaced mesh of same size on both \( x \) and \( y \), which will be called just \( \Delta a \), the equation can be rewritten as follows. \[ \frac{f^{x,y} - f_o^{x,y}}{\Delta t} = c \Bigg{[} \frac{f^{x+1,y} - 2 f^{x,y} + f^{x-1,y}}{\Delta a^2} + \frac{f^{x,y+1} - 2 f^{x,y} + f^{x,y+1}}{\Delta a^2} \Bigg{]} + s^{x,y} \] This generates the set of linear equations, for each element \( f^{x, y} \): \[ [\Delta a^2 + 4 c \Delta t]f^{x,y} c \Delta t f^{x+1, y} c \Delta t f^{x-1, y} c \Delta t f^{x, y+1} c \Delta t f^{x, y-1} = \Delta a^2 f_o^{x, y} + \Delta a^2 \Delta t s^{x, y} \] Which can be written as a linear system \( Ax = b \) with \( N \times N \) equations, where \( N \) is the number of elements in the discretized domain, \( A \) is a matrix of coefficients, \( x \) are the unknown and \( b \) is a vector of known values, dependent on previous time and the source term. \[ \begin{bmatrix} \Delta a^2 + 4 c \Delta t & c \Delta t & 0 & ...\\ c \Delta t & \Delta a^2 + 4 c & c \Delta t & ...\\ 0 & c \Delta t & \Delta a^2 + 4 c & ...\\ 0 & 0 & c \Delta t & \ddots \end{bmatrix} \begin{bmatrix} f^{0,0}\\ f^{1,0}\\ f^{2,0}\\ \vdots \end{bmatrix} = \begin{bmatrix} \Delta a^2 f_o^{0,0} + \Delta a^2 \Delta t \ s^{0, 0}\\ \Delta a^2 f_o^{1,0} + \Delta a^2 \Delta t \ s^{1, 0}\\ \Delta a^2 f_o^{2,0} + \Delta a^2 \Delta t \ s^{2, 0}\\ \vdots \end{bmatrix} \] For the boundary conditions, a simplification will be made in such a way that the values are assumed to be known and equal to zero.

There are many ways to solve the linear system of equations. In this post, I compare the runtime for five possible methods: (a) matrix inversion, (b) LU decomposition, (c) sparse LU decomposition, (d) sparse ILU decomposition and (e) GMRes. I have no intention of implementing all those methods - They are all available in the scipy library. In order to isolate each method, a common interface is proposed:


def solve_2d_inv_A():
    x = np.dot(np.linalg.inv(A), b)
    return x.reshape((N, N))

def solve_2d_linalg_solve():
    x = np.linalg.solve(A, b)
    return x.reshape((N, N))

def solve_2d_linalg_spsolve():
    x = linalg.spsolve(sp_A_csr, b)
    return x.reshape((N, N))

def solve_2d_linalg_ilu():
    invA = linalg.spilu(sp_A_csc)
    x = invA.solve(b)
    return x.reshape((N, N))

def solve_2d_linalg_gmres():
    x, _ = linalg.gmres(sp_A_csr, b)
    return x.reshape((N, N))

solver_names = [
    'inv_A',
    'dense_LU',
    'sparse_LU',
    'sparse_ILU',
    'GMRes'
]
solver_name_to_function = {
    'inv_A': solve_2d_inv_A,
    'dense_LU': solve_2d_linalg_solve,
    'sparse_LU': solve_2d_linalg_spsolve,
    'sparse_ILU': solve_2d_linalg_ilu,
    'GMRes': solve_2d_linalg_gmres,
}
The code to build the \( A \) matrix and the \( b \) vector is shown below:

def build_A():
    to_flat_index = lambda x, y: x * N + y
    A = np.zeros(shape=(N * N, N * N))
    for i in range(N):
        for j in range(N):
            x = to_flat_index(i, j)

            y = to_flat_index(i, j)
            A[x, y] = ds**2 + 4 * c * dt
            if i - 1 >= 0:
                y = to_flat_index(i - 1, j)
                A[x, y] = -c * dt
            if i + 1 < N:
                y = to_flat_index(i + 1, j)
                A[x, y] = -c * dt
            if j - 1 >= 0:
                y = to_flat_index(i, j - 1)
                A[x, y] = -c * dt
            if j + 1 < N:
                y = to_flat_index(i, j + 1)
                A[x, y] = -c * dt
    return A


def build_b():
    fo = np.zeros(shape=(N, N,))
    fo[5:15, 5:15] = 10.0
    b = ds ** 2. * fo.reshape((N * N,))
    return b
It is possible to verify how the runtime evolves with the mesh size in the table below. It is important to notice that iterative methods such as GMRes give an approximation of the solution, so it would be relevant to evaluate the accuracy of the result if adopting such methods.
N inv_Adense_LUsparse_LUsparse_ILUGMRes
10 0.402125 0.205415 0.374370 0.369797 1.273823
15 2.672651 0.916614 0.721103 0.752904 1.398468
20 10.229078 3.763351 1.211724 1.346811 1.421030
25 40.561536 14.481547 2.017594 2.254834 1.514201
30 NaN 38.811441 2.959303 3.436828 1.646307
35 NaN NaN 3.828993 4.631033 1.825023
40 NaN NaN 5.206256 6.056734 1.865046
One last thing to mention is the memory issue in the current implementation will prevent us to grow the mesh too much. Only as an exercise, imagine how much memory would be required to store a \( N \times N \) mesh for a big \( N \) (The size of the matrix \( A \) is \( (N \times N)^2 \)). To solve this issue, it is important to notice that the matrix \( A \) is sparse. To take advantage of this property, it is possible to use Numpy's sparse structures, such as `coo_matrix`. The code for the assembly of the matrix would be something like the following:

coo_i = []
coo_j = []
coo_data = []
def set_A(x, y, value):
    coo_i.append(x)
    coo_j.append(y)
    coo_data.append(value)

for i in range(N):
    for j in range(N):
        x = to_flat_index(i, j)
        y = to_flat_index(i, j)
        set_A(x, y, ds**2 + 4 * c * dt)
        if i - 1 >= 0:
            y = to_flat_index(i - 1, j)
            set_A(x, y, -c * dt)
        if i + 1 < N:
            y = to_flat_index(i + 1, j)
            set_A(x, y, -c * dt)
        if j - 1 >= 0:
            y = to_flat_index(i, j - 1)
            set_A(x, y, -c * dt)
        if j + 1 < N:
            y = to_flat_index(i, j + 1)
            set_A(x, y, -c * dt)

A = coo_matrix((coo_data, (coo_i, coo_j)), shape=(N * N, N * N))

All the previously mentioned methods are single-threaded. There's one parallel linear solver available on Intel MKL named [PARDISO (Parallel Direct Solver)](https://software.intel.com/en-us/articles/intel-mkl-pardiso), but by the time this post was written, it didn't have any Python bindings. There are several C++ to Python binding libraries out there, such as CTypes, Boost::Python, or even Cython. For this particular example, I used Pybind11 to expose a pre-configured PARDISO solver to Python. By passing numpy arrays directly to the bindings, to avoid object copies, the function returns the `x` solution vector. The module code is available below. The code for PARDISO configuration itself is huge (and not really relevant, since it is just a matter of reading the docs and choosing parameters). With that implementation, it is possible to solve the same problem as before, but in a multithreaded environment, provided by PARDISO.


PYBIND11_MODULE(pypardiso, m)
{
    m.def("setup", &pardiso_setup);
    m.def("solve",
        [](
            py::array_t const& rows_A,
            py::array_t const& cols_A,
            py::array_t const& A,
            py::array_t const& py_b
        )
        {
            auto n = int(py_b.size());
            auto x = py::array_t({size_t(n)});
            pardiso_solve(A.data(), rows_A.data(), cols_A.data(), py_b.data(), x.mutable_data(), n);
            return x;
        }
    );
}
This is a very brief overview on how to solve the 2d diffusion equation using some knowledge I acquired over the years while working on my Master's degree. Although the project from my Master's was not related with the 2d diffusion equation, some mathematical and computational problems intersect, and thus having this parallel and simpler 2d side project was a good way to help me keep going at the time.


Back