An integral-like numerical approach for solving Burgers' equation

An unconventional approach is applied to solve the one-dimensional Burgers' equation. It is based on spline polynomial interpolations and Hopf-Cole transformation. Taylor expansion is used to approximate the exponential term in the transformation, then the analytical solution of the simplified equation is discretized to form a numerical scheme, involving various special functions. The derived scheme is explicit and adaptable for parallel computing. However, some types of boundary condition cannot be specified straightforwardly. Three test cases were employed to examine its accuracy, stability, and parallel scalability. In the aspect of accuracy, the schemes employed cubic and quintic spline interpolation performs equally well, managing to reduce the $\ell_{1}$, $\ell_{2}$ and $\ell_{\infty}$ error norms down to the order of $10^{-4}$. Due to the transformation, their stability condition $\nu \Delta t/\Delta x^2>0.02$ includes the viscosity/diffusion coefficient $\nu$. From the condition, the schemes can run at a large time step size $\Delta t$ even when grid spacing $\Delta x$ is small. These characteristics suggest that the method is more suitable for operational use than for research purposes.


Introduction
The partial differential equation (PDE) of the form The standard numerical approach to tackle the non-linear term is to linearize it by assuming that u in uu x is locally constant, as done in previous works such as [2,3,4,5,6,7,8].However, this approach usually has limitations, particularly in terms of accuracy.Huang and Abduwali [6] successfully developed an explicit and unconditionally stable scheme using this assumption.Another approach is to rewrite the non-linear term as (u 2 ) x and solve it accordingly as in [9,10,11,12,13,14].However, these methods often involve complicated, iterative or implicit schemes that require solving a large set of linear equations, which can limit computational speed and parallel scalability of the programs.
On the other hand, some researchers, e.g., [15,16,17,18,19,20,21,22,23,24], have applied numerical Hopf-Cole transformation and solved the resulting diffusion equation instead.Advanced schemes have been employed to accurately diffuse exponential profiles generated by the Hopf-Cole transformation, while the transformation itself is generally approximated using a finite number of terms of its infinite series form or integrated numerically using Gaussian quadrature.
In this paper, we introduce an integral-like approach that mimics mathematical transformations and temporal integration to advance the numerical solution in time.In contrast to finite discretization, we employ spline interpolation to represent gridded data as a continuous function.The general idea of this approach is explained in Section 2. In Section 3, the time-stepping method used in solving the diffusion equation is described as an example and as an indispensable part of the scheme for solving Burgers' equation.In Section 4, the numerical Hopf-Cole transformation is explained and the entire scheme is composed with additional remarks on programming aspect.In Section 5, the results of several numerical experiments are shown, including an example of solving Burgers'-Fisher equation.Conclusions are presented in Section 6.

Integral-like approach
The proposed integral-like approach is different from conventional methods in both numerical differentiation in space and numerical forwarding scheme in time.To solve a PDE, our data grid contains not only the necessary field variables but also their derivatives, so that the variables can be represented as a continuous function using spline polynomial interpolation between adjacent grid points.To advance the variables in time, mathematical procedures along with additional approximations, based on the corresponding analytical solution, are emulated using the known continuous spline polynomial function.The derivatives also need to be updated in time.Initializing the derivatives using a finite difference method has been found to Table 1: Integral-like schemes with their associated spline polynomial interpolations P j for u, u x and u xx between x j and x j+1 where y = x − x j and 0 ≤ y < ∆x

Scheme
Local spline polynomial P j (y) Data grid Linear (LG) Quintic (QG) a j y 5 + b j y 4 + c j y 3 + d j y 2 + e j y + f j u 5a j y 4 + 4b j y 3 + 3c j y 2 + 2d j y + e j u x 20a j y 3 + 12b j y 2 + 6c j y + 2d j u xx be sufficient.Since we can split terms in a PDE and solve more but simpler PDEs successively, as discussed in [36], the approach can be applied to any Burger's-type equation.In Section 5, a Burgers'-Fisher equation, , is split and solved as an example.
In this paper, we investigate integral-like methods that utilize linear, cubic and quintic spline interpolations.These methods will be referred to as linear grid scheme (LG), cubic grid scheme (CG) and quintic grid scheme (QG) respectively, as in Table 1.For the cubic and quintic schemes, the secondorder central finite difference is used to initialize u x and u xx , except at the two end points where the first-order forward and backward finite difference are employed instead.
To forward the numerical solution of the Burgers' equation in time, the Hopf-Cole transformation is applied numerically, the resulting diffusion equation is solved in the Hopf-Cole space and then converted back to the normal space.The application of the integral-like approach to the diffusion equation is discussed first to familiarize readers with the general idea of the method.It is worth noting that the integral-like approach is an explicit scheme that incorporates mathematical formulas and is equivalent to a Semi-Lagrangian method when applied to the linear advection equation.

Method for linear diffusion
The linear diffusion equation, ϕ t = νϕ xx , where ν is a positive diffusion constant, can be solved analytically using the Fourier transform.This ap-proach is discussed in textbooks such as [37] and [38].The analytical solution is with analytical boundary conditions ϕ(∞, t) = ϕ(−∞, t) = 0.The time-stepping scheme of the integral-like approach can be derived from Eq. (3.1) by substituting the spline polynomial function P j (y, t) and changing the time interval.In the case of the cubic scheme (CG) with equallyspacing grid points, Eq. (3.1) becomes Due to the Gaussian decay term, the summation range can be limited to j such that the relative distance ℓ i,j ≡ ξ j − x i is within a margin of five standard deviations, i.e., 5σ = 5 √ 2ν∆t, which is chosen for this paper.It was found experimentally that all three methods are numerically stable as long as the maginal range of 5σ is larger than the grid spacing ∆x.This is evidenced in Figure 7. Mathematically, the condition is equivalent to d ≥ 0.02, where d = ν(∆t)/(∆x) 2 is a non-dimensional diffusion number.Interestingly, the stability condition for these methods is reversed compared to the stability condition of explicit finite difference schemes.Since the 5σ length is an estimation, the value 0.02 is not exact.Notably, the stability condition is independent of u.
Due to the summation of j within the marginal range, the time complexity of the integral-like method is O((5σ/∆x) , where n x is the number of grid points and n t is the total number of time step.Evidently, the complexity of the algorithm is not linear.
Another implication of the marginal range is that the boundary conditions may have to be specified by a small set of points, instead of a single point exactly at the boundary.For example, in the case of periodic boundary condition, the required outside point −j on the left of the considered domain corresponds to the inside point n − j.Similarly, the values at the outside point n x + j on the right are that of the j-th point.For Dirichlet and no-flux boundary conditions, reflected points or their mirror images can be used.Adding extra grid points to both ends can keep the implementation simple and is adopted here, as it also matches with the domain decomposition method for parallelization.
Next, to quantify Eq. (3.2), the indefinite integral of the form, y m exp(−(y+ ℓ) 2 /δ) dy, where m is a positive integer, is evaluated recursively using Eqs.(3.3) -(3.5), which derived by applying integration by parts. where On the other hand, the coefficients of the cubic spline polynomial function P j (y), i.e., a j , b j , c j and d j , are found by solving Eqs.(3.6) -(3.9) at each time step.These coefficients are analytically solved beforehand to speed up the program.
We denote ∂ x ϕ j as the first derivative of ϕ at grid point j.These derivatives are stored in a data grid and are updated by using Eq.(3.10), which derived by differentiating Eq. (3.2) with respect to x i .
Hence, Eqs.(3.2) and (3.10) together form the complete time-stepping method for the integral-like cubic scheme.Their computation is facilitated by Eqs.(3.3) -(3.9).The derivation of the linear scheme and the quintic scheme follows a similar approach as the cubic scheme, but with a different form of spline polynomial P j (y) substituted in, and with a different number of data grids to be stored and iterated in time.

Method for Hopf-Cole transformation
For the cubic scheme, the first step in performing the Hopf-Cole transformation, changing u to ϕ (Eq.1.2), is to compute the integral of u in Eq. (4.1).

Int(u)
Then, the exponent −(1/2ν) Int(u) in Eq. (1.2) is represented as a quintic spline polynomial function using Int(u), u and u x .The quintic coefficients are found by solving six linear equations developed analogously to Eqs. (3.6) -(3.9).Then, for a quintic polynomial Q i of the i-th segment, the value of ϕ between x i and x i+1 is written as a Taylor series T i of exp(Q i ).In our implementation, the number of terms in the Taylor series are varied adaptively, up to 16 terms, to ensure that the deviation of T i is less than 0.01%.The resulting ϕ or T i , which satisfying the diffusion equation, is then solved using the method discussed in Section 3.
On the other hand, the Hopf-Cole transformation from ϕ to u in Eq. (1.3) is done by direct substitution.For example, u and u x of the cubic scheme are deduced from ϕ, ϕ x and ϕ xx by using Eqs.(4.2).
To sum up, an integral-like scheme for solving Burgers' equation consists of the following components: (1) Numerical Hopf-Cole integration scheme for transforming from u space to ϕ space, (2) Numerical diffusion scheme for advancing ϕ(x, t) and its required derivatives to ϕ(x, t + ∆t), (3) Numerical Hopf-Cole differentiation scheme for transforming ϕ(x, t + ∆t) and its derivatives back to u space, and (4) Spline interpolation method for representing discrete data points as a continuous function for the computation in (1)-(3).One complete time iteration step ∆t for solving Burgers' equation is composed of (1), ( 2) and (3).In addition, a finite difference scheme is employed for computing the initial derivatives such as u x (x, t 0 ) from u(x, t 0 ).Our source code is publicly available on GitHub (https://github.com/SKanoksi).
According to our implementation, because the numerical Hopf-Cole integration step involves rapid exponential decay/growth, numerical round-off error must be carefully minimized.At least double-precision floating-point data type may be a minimum.Moreover, additional memory allocation is required for the derivatives of the cubic scheme and the quintic scheme, resulting in space complexity of O(4n x ) and O(6n x ) respectively.The common factor of 2 is because a temporary array is needed for keeping updated value at each position, before entirely transferred to the primary array in the final step.For our current implementation, another n x array is used to temporarily store Int(u).
An implementation of an adaptive grid was explored, where grid points are redistributed based on the path length of u, numerically approximated using Gaussian quadrature.However, the numerical solution was found to be less accurate due to numerical errors introduced when repeatedly rearranging the grids.Therefore, our unsuccessful implementation of the adaptive grid is briefly noted in this paper to inform the possibility that adaptive grid may not improve integral-like schemes in general.

Numerical experiment
In this section, four example cases are used to evaluate the integral-like methods in Table 1.The accuracy of the methods is tested using Example 1 and 2, while numerical stability and parallel scalability are evaluated using Example 3. Example 4 investigates the viability of the split approach for more complicated problems.
To quantify the accuracy of the schemes when comparing with exact/analytical solutions f (x), the ℓ 1 , ℓ 2 and ℓ ∞ error norms are adopted.These error norms are calculated as shown below, for equally-spacing grid.
Example 1. Burgers' equation (Eq.1.1) with the initial condition u(x, 0) = sin(πx) (5.4) and the exact solution where are considered.Unlike in [33] and [39], periodic boundary conditions are employed.The exact solution and numerical results are displayed in Figure 1 for ν = 0.01, ∆x = 0.02, and ∆t = 0.005, while the numerical values at some grid points are given in Table 2.The error norms for four different pairs of ∆x and ∆t are shown in Table 3.
From the results, the linear scheme (LG) always performs the worst, while the cubic scheme (CG) and the quintic scheme (QG) are relatively comparable.For CG and QG, as the grid spacing and time step size are reduced, the error norms may not further decrease as seen in Figure 4, where ∆x is varied from 0.005 to 0.04.This behavior may be caused by truncation error introduced when approximating exp and erf functions, which are frequently used in the schemes.As a result, the order of accuracy of CG and QG remains nearly unchanged, while LG is approximately second-order accurate, but probably becomes constant at around ℓ 2 = 10 −3 as well.LG CG QG and the exact solution where t ≥ 1 and t 0 = exp(0.125/ν)are considered as in [40], [33] and [39].In our case, the Dirichlet boundary conditions are modeled using the inverted reflection of the solution.
The results of this example, using ν = 0.005, ∆x = 0.02 and ∆t = 0.02, are shown in Figure 3 and Table 4, while Table 5 and Figure 4 provide more insight on their order of accuracy.The results in Figure 4 exhibit similar features as in Figure 2 of previous example, except for the oscillation of CG and QG at small ∆x.The swing may be due to the accumulation of errors, similar to Runge's and/or Gibbs phenomenon near the steep slope.This behavior does not occur in Example 1, probably because the abrupt change is stationary at the grid point x = 0, while, in this example, the change is not always on a grid points as ∆x is varied.1) with a step initial condition at x = 0 is used to study both the stability and parallel scalability of the integral-like method.The exact solution for infinite boundary conditions u(−∞, t) = 1 and u(∞, t) = 0 is which becomes a traveling-wave solution thereafter about t = 10.Using ν = 1, ∆x = 0.6 and ∆t = 0.04, the results are shown in Figure 5 and 6 for different experimental setups.In Figure 7, the ℓ 2 error norm is plotted against the non-dimensional diffusion number d = ν∆t/∆x 2 .It can be seen that the stability condition of the integral-like method is not d < 1, as also indicated by Figure 2 and 4, but rather d > 0.02.This finding demonstrates that, unlike most simple explicit methods, the integral-like method is stable at large ∆t.On the other hand, the condition d > 0.02 implies that the marginal range 5σ has to be larger than the grid spacing, as discussed in Section 3. The remaining problem encountered when ν is very small can be resolved by simply increasing ∆t.
A weak-scaling experiment was performed on the TARA cluster of the NSTDA supercomputer center (ThaiSC) to test the parallel efficiency of the integral-like method.Both the number of grid points and the number of LG CG QG  √ 2ν∆t approximately unaltered.In other words, roughly the same number of u(x j , t) is used in updating u(x i , t + ∆t).
With N s representing a running variable, the number of employed CPU cores is N core = N 2 s operating on the total number of grid points n x = 500 N s + 1 for domain [−15, 400 N s + 20] to run the simulation from t 0 = 10 to t = 800 N s + 10.The time step size ∆t is varied to explore the influence of the marginal range on parallel scalability.This is because the bigger the marginal range is, the larger the overlapped areas of the domain decomposi-  TARA compute nodes, equipped with two Intel Xeon Gold 6148 CPU (2.40GHz) and 192GB of RAM, are employed.Our program was coded in Python and parallelized using mpi4py library, before being ported to C language by using Cython and compiled on TARA using foss-2021b toolchain.The weak-scaling parallel efficiency R(1)/R(N s ), where R(N s ) is wall-clock runtime of the N s scaled case, is plotted in Figure 8 from N s = 1 to N s = 10.The results of the serial runtime R(1) are provided in Table 6.
From Figure 8, the parallel efficiency of the integral-like schemes decreases as a larger time step size ∆t is chosen.However, from Table 6, the wall-clock   Using the same setup as in Figure 5 of Example 3, the numerical results of CG and QG agree well with the exact solution as seen in Figure 9.The integral-like approach, therefore, has the potential for solving other Burger'stype equations as well.

Conclusions
In this study, a numerical approach based on the continuous representation of variables using local polynomial interpolation is explored.When applied to solve the one-dimensional Burgers' equation, the schemes derived using linear (LG), cubic (CG) and quintic (QG) interpolations are found to be numerically stable at large time step size ∆t, with a stability condition of ν∆t > 0.02∆x 2 .However, the order of accuracy is not consistently improved when a smaller grid spacing is employed; nonetheless, the smallest possible error norms are around the order of 10 −4 .Being an explicit scheme, their weak-scaling parallel efficiency is generally adequate.This approach shows promise for operational applications that favor reliability, fast computation and good parallel scalability, such as numerical weather prediction.

Figure 8 :
Figure 8: Weak-scaling parallel efficiency of the integral-like schemes for Burgers' equation using time step size ∆t = 0.2, 0.8 and 3.2.

Example 4 .∂x 2 − 2
Burgers3u(1 − u)(1 − 2u)(5.10)    is considered to show a possible extension of the integral-like method.Applying the split approach, which separates terms in the equation into stages and successively solve them, to Eq. (5.10), the stage equations are(1 − u)(1 − 2u) (5.12)The first stage, Eq. (5.11), is to solve the Burger's equation; therefore, the procedures discussed in previous sections are directly employed.The second stage, Eq. (5.12), is a growth/decay equation.Its analytical solution is found by solving Therefore, the integral-like scheme for this second stage isu(x i , t+∆t) x i , t) − 1)2From[41], an exact solution of Eq. (5.10) when ν = 1 is u(x, t)

Table 2 :
Numerical results and exact solution of Example 1 at t = 1.0 using ν = 0.1 and ∆t = 0.01

Table 4 :
Numerical results and exact solution of Example 2 at t = 2.4 using ν = 0.005 and ∆t = 0.02

Table 6 :
The wall-clock serial runtime R(1) of each integral-like scheme for different time step size ∆t, but ran for the same simulation time, i.e., from t 0 = 10 to t = 810.larger time step size is evidently lower.The parallel efficiency also declines as the problem size becomes larger and more CPU cores are used, but remains roughly unchanged at moderate-to-large scale cases.Therefore, the time step size of operational applications may need to be experimentally found to match available computational resources.