Note:The first section of this blog post is mathematical and useful to understand, however, if you are just looking for an implementation you can skip through to the first code block.

Partial Differential Equations are notoriously difficult to solve analytically; in all but the simplest cases, there often does not exist a solution in elementary form. This would be acceptable, but for the ubiquity of partial differential equations in physical models. So Mathematicians and Computer Scientists had to develop methods of solving PDEs numerically.

In this blog post, I will show one such method of solving Laplace’s equation (a type of PDE) in $\mathbb{R}^{2}$ using the method of successive over-relaxation, an iterative technique which involves splitting the relevant domain into a grid and sampling each point in a fixed order. For those who don’t know, or have forgotten, Laplace’s equation is written as follows:

$$\nabla^{2} \psi = 0$$

In 2D this can be written:

$$\frac{\partial^{2} \psi}{\partial x^{2}} + \frac{\partial^{2} \psi}{\partial y^{2}} = 0$$

We can use the second-order central difference method: $f^{\prime\prime}(x) \approx \frac{f(x+h) – 2f(x) + f(x-h)}{h^{2}}$ and the definition of a partial derivative to get the following:

$$\frac{\psi(x+\delta x, y) – 2\psi(x) + \psi(x – \delta x,y)}{(\delta x)^{2}}+\frac{\psi(x,y+\delta y) – 2\psi(x,y) + \psi(x,y-\delta y)}{(\delta y)^{2}}=0$$

Noting that if we sample at every $\delta x$ in the $x$-direction and $\delta y$ in the $y$-direction, then we can rewrite this as:

$$\frac{\psi_{i+1,j}-2\psi_{i,j}+\psi_{i-1,j}}{(\delta x)^{2}}+\frac{\psi_{i,j+1}-2\psi_{i,j}+\psi_{i,j-1}}{(\delta y)^{2}}=0$$

Rearranging this to make $\psi_{i,j}$ the subject, we have:

$$\psi_{i,j}=\frac{(\psi_{i+1,j}+\psi_{i-1,j})/(\delta x)^{2} + (\psi_{i,j+1}+\psi_{i,j-1})/(\delta y)^{2}}{2\left((\delta x)^{-2} + (\delta y)^{-2}\right)}$$

This lends itself to a simple iterative formula called the Gauss-Seidel method, which (for $\delta x = \delta y = \delta$) reads:

$$\psi_{i,j}^{m+1}=\frac{1}{4}\left(\psi_{i+1,j}^{m+1}+ \psi_{i-1,j}^{m}+\psi_{i,j+1}^{m+1}+\psi_{i,j-1}^{m}\right)$$

Where $\psi_{i,j}^{m}$ is the $m$-th iteration at position $(i,j)$ in the sample-space grid and where we sample from the top-right corner and sample downwards going from right-to-left along each row. It turns out that we can improve upon the convergence rate of this algorithm by introducing an over-relaxation parameter $\alpha$ and following this iterative formula:

$$\psi_{i,j}^{m+1} = \alpha\frac{(\psi_{i+1,j}^{m+1}+\psi_{i-1,j}^{m})/(\delta x)^{2} + (\psi_{i,j+1}^{m+1}+\psi_{i,j-1}^{m})/(\delta y)^{2}}{2\left((\delta x)^{-2} + (\delta y)^{-2}\right)} + (1-\alpha)\psi_{i,j}^{m}$$

Which for $\delta x = \delta y = \delta$ is simplified considerably to yield:

$$\psi_{i,j}^{m+1}=\psi_{i,j}^{m} + \frac{\alpha}{4}\left(\psi_{i+1,j}^{m+1}+ \psi_{i-1,j}^{m}+\psi_{i,j+1}^{m+1}+\psi_{i,j-1}^{m} – 4\psi_{i,j}^{m}\right)$$

Where $\alpha \in (1,2)$ has a different optimal value depending on the problem. It turns out that for a 2D grid with Dirichlet boundaries, the optimal value of $\alpha$ is the smallest root of the equation (this is stated here at the bottom of page 81):

$$\left[\cos\left(\frac{\pi}{p}\right)+\cos\left(\frac{\pi}{q}\right)\right]\alpha^{2}-16\alpha – 16 = 0$$

Where $p$ and $q$ are the number of sampling points in the $x$- and $y$-directions. This root can be computed (taking numerical stability into account), using the following formula:

$$\alpha = \frac{2}{1+\sqrt{1-\left(\frac{\cos(\pi/p)+\cos(\pi/q)}{2}\right)^{2}}}$$

Which for $p = q$, simplifies to give:

$$\alpha = \frac{2}{1 + \sin\left(\frac{\pi}{p}\right)}$$

Now that we have understood the mathematical basis for the successive over-relaxation method, we can move on to implementation. We start by defining the following templated prototype:

template <typename T> std::size_t LaplaceSolver( std::vector<std::vector<T>>& vecGrid, T tDelta = T(1.0), std::size_t sMaxNumIterations = 30, T tEps = std::numeric_limits<T>::epsilon() );

We return a `std::size_t`

which reports back the number of iterations performed. We note that we can determine the Height and the Width of the grid from the `std::vector`

the function is supplied with (where we will be storing our computed values). So the beginning of the function looks like:

static_assert(std::is_arithmetic<T>::value, "Laplace's equation can only be solved for numeric types"); std::size_t sHeight = vecGrid[0].size(), sWidth = vecGrid.size(); T tOverRelaxationParameter = T( 0.5 ); // N.B: 0.5 rather than 2.0 because else we divide by 4.0 later. if( sHeight == sWidth ) { tOverRelaxationParameter /= T( 1.0 ) + std::sin( M_PI / sHeight ); } else { tOverRelaxationParameter /= T( 1.0 ) + std::sqrt( T(1.0) - std::pow(T(0.5) * (std::cos(M_PI / sHeight) + std::cos(M_PI / sWidth)), 2U) ); }

Here we just calculate the over-relaxation parameter $\alpha$ using the formula above, and define variables for our height and width so we don’t have to keep polling the `std::vector`

object. We are now ready to write the main loop of the algorithm and finish the function:

std::size_t sCurrIteration = 1U; bool bPrecisionAchieved = false; while( sCurrIteration <= sMaxNumIterations && !bPrecisionAchieved ) { bPrecisionAchieved = true; for( std::size_t sY = sHeight - 2U; sY >= 1U; --sY ) { for( std::size_t sX = sWidth - 2U; sX >= 1U; --sX ) { T tResidual = vecGrid[sX][sY + 1U] + vecGrid[sX][sY - 1U] + vecGrid[sX - 1U][sY] + vecGrid[sX + 1U][sY] - T( 4.0 ) * vecGrid[sX][sY]; vecGrid[sX][sY] += tOverRelaxationParameter * tResidual; bPrecisionAchieved = bPrecisionAchieved && (std::fabs(tResidual) < tEps); } } ++sCurrIteration; } return sCurrIteration;

We are now ready to test our algorithm and make sure that it works as expected. As a side note, it is often useful to write a timing function to measure the performance of the function, using C++11’s variadic templates, we can use something like the following for most purposes:

template <typename Func, typename... Args> auto TimedSection( const std::string& szBeginMessage, const std::string& szEndMessage, Func f, Args... args ) -> typename std::result_of<Func(Args...)>::type { std::cout << szBeginMessage << std::endl; std::chrono::high_resolution_clock clock; auto timeStart = clock.now(); auto result = f( args... ); auto timeEnd = clock.now(); auto duration = timeEnd - timeStart; std::cout << szEndMessage << std::endl; std::cout << "Running Time: " << std::chrono::duration_cast<std::chrono::milliseconds>(duration).count() << "ms" << std::endl; return result; }

We can test the algorithm by working backwards from the solution in some region $R \subset \mathbb{R}^{2}$. If we consider the solution $\psi(x,y)=\sin(x)\sinh(y)$ in the region $R = [0,1]\times[0,1]$, then we have the following boundary conditions:

$$\psi(0,y) = 0 \land \psi(x,0) = 0 \land \psi(1,y) = \sin(1)\sinh(y) \land \psi(x,1)=\sin(x)\sinh(1)$$

Thus we can write the following test code using a $7 \times 7$ grid:

int main( int argc, char* argv[] ) { std::vector<std::vector<double>> vecGrid( 7U, std::vector<double>( 7U, 1.0 ) ); double dbDelta = 1.0 / 6; // Set up boundary conditions: for( std::size_t s = 0U; s < 7U; ++s ) { vecGrid[0][s] = 0.0; vecGrid[6][s] = std::sin( 1.0 ) * std::sinh( s * dbDelta ); } for( std::size_t s = 0U; s < 7U; ++s ) { vecGrid[s][0] = 0.0; vecGrid[s][6] = std::sin( s * dbDelta ) * std::sinh( 1.0 ); } // Solve Laplace's equation on the grid: unsigned int iNumIterations = TimedSection( "Beginning LaplaceSolver...", "Finished LaplaceSolver...", LaplaceSolver<double>, std::ref( vecGrid ), dbDelta, 300, std::numeric_limits<double>::epsilon() ) - 1U; std::cout << "Number of iterations: " << iNumIterations << std::endl; return 0; }

If we set a breakpoint in the debugger at the `return 0;`

statement, and analyze our vector, we will see that we have the correct solution (to machine epsilon) after only about 45 iterations.

We can visualize the accuracy of this technique in the following image, which shows the algorithm being run on a $7 \times 7$ grid and a $21 \times 21$ grid and finally a density plot of the analytic solution.

This is a useful technique, especially for simple and small systems; and can be easily extended to consider boundaries more complex than rectangles by adding a masking structure (in this case I’ve used a `std::map`

that maps each sampling coordinate to a `bool`

which determines whether to update the value or not (N.B: this also presents a useful optimization opportunity by allowing us to no longer . An extended implementation (that overloads the existing `LaplaceSolver`

) is given below:

template <typename T> std::size_t LaplaceSolver( std::vector<std::vector<T>>& vecGrid, std::map<std::pair<std::size_t, std::size_t>, bool> mapMask, T tDelta = T( 1.0 ), std::size_t sMaxNumIterations = 30, T tEps = std::numeric_limits<T>::epsilon() ) { static_assert(std::is_arithmetic<T>::value, "Laplace's equation can only be solved for numeric types"); std::size_t sHeight = vecGrid[0].size(), sWidth = vecGrid.size(); T tOverRelaxationParameter = T( 0.5 ); // N.B: 0.5 rather than 2.0 because else we divide by 4.0 later. if( sHeight == sWidth ) { tOverRelaxationParameter /= T( 1.0 ) + std::sin( M_PI / sHeight ); } else { tOverRelaxationParameter /= T( 1.0 ) + std::sqrt( T( 1.0 ) - std::pow( T( 0.5 ) * (std::cos( M_PI / sHeight ) + std::cos( M_PI / sWidth )), 2U ) ); } std::size_t sCurrIteration = 1U; bool bPrecisionAchieved = false; while( sCurrIteration <= sMaxNumIterations && !bPrecisionAchieved ) { bPrecisionAchieved = true; for( std::size_t sY = sHeight - 2U; sY >= 1U; --sY ) { for( std::size_t sX = sWidth - 2U; sX >= 1U; --sX ) { auto it = mapMask.find( std::make_pair( sX, sY ) ); if( it != mapMask.end() && it->second == false ) continue; T tResidual = vecGrid[sX][sY + 1U] + vecGrid[sX][sY - 1U] + vecGrid[sX - 1U][sY] + vecGrid[sX + 1U][sY] - T( 4.0 ) * vecGrid[sX][sY]; vecGrid[sX][sY] += tOverRelaxationParameter * tResidual; bPrecisionAchieved = bPrecisionAchieved && std::fabs( tResidual ) < tEps; } } ++sCurrIteration; } return sCurrIteration; }

I hope this is useful to someone; I am likely to do a future blog post discussing parallelized solutions to PDEs (and include these in my AMP Numerics Library).

**ZIP containing .cpp file:** LaplaceSolver