Parallel implementation of compact numerical schemes - :6479 | National Weather Service (NWS)

Parallel implementation of compact numerical schemes
• Published Date: 2001
• Series: [PDF-1.97 MB]

Details:
• Description:
The formulation of compact schemes in a parallel environment is described in detail for the case of differencing. Compact numerical schemes are spatially implicit methods for obtaining spatial derivatives and partial integrations, interpolating from one grid to its staggered counterpart, and so on, where the target results occupy a regular grid as dense as the regular grid of source data. Among the numerical schemes that perform such operations at a given formal order of accuracy, the compact forms usually yield the smallest principal truncation error coefficients and, when properly coded for a shared-memory machine, add only minimal computational cost. However, the inherently recursive nature of compact schemes makes them difficult to implement efficiently on distributed-memory machines, and this note sets out to address that problem. Generally, a compact scheme is expressed for an entire line of data simultaneously by a linear system characterized by banded matrices on both sides of the equation. The solution involves a preliminary "L-D-U" decomposition of the system matrix on the left-hand side of the equation, where L and U are lower and upper triangular banded matrices with unit main diagonals and D is a diagonal matrix. For the common case in which the original system matrix is symmetric, the factors L and U are transposes of each other and the decomposition is a modified Cholesky factorization. The components of the resulting factors provide the coefficients for a pair of one-sided recursions required by the subsequent application of the scheme to lines of data. For unbounded (effectively infinite or cyclic) lines of data on a uniform grid, the recursions simplify to the extent of having constant coefficients, and it is convenient to retain this simplicity through appropriate manipulation of the scheme's end conditions, even when finite boundaries intrude.
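As a rough illustration of the factor-then-recurse procedure described above, the following sketch applies the classical fourth-order tridiagonal compact first-derivative scheme, (1/4) f'[i-1] + f'[i] + (1/4) f'[i+1] = (3/4)(f[i+1] - f[i-1])/h, on a bounded uniform grid. It is not taken from the report: the function names `ldlt_tridiagonal`, `ldlt_solve`, and `compact_derivative` are hypothetical, and the end condition (derivative values supplied at both ends) is an assumption chosen only to keep the system symmetric.

```python
import math

def ldlt_tridiagonal(diag, off):
    """Factor a symmetric tridiagonal matrix as L D L^T (modified Cholesky).

    diag: main diagonal (length n); off: sub/super-diagonal (length n - 1).
    Returns (d, l): the diagonal of D and the sub-diagonal of the unit
    lower-triangular factor L.
    """
    n = len(diag)
    d = [0.0] * n
    l = [0.0] * (n - 1)
    d[0] = diag[0]
    for i in range(1, n):
        l[i - 1] = off[i - 1] / d[i - 1]
        d[i] = diag[i] - l[i - 1] * off[i - 1]
    return d, l

def ldlt_solve(d, l, rhs):
    """Solve (L D L^T) x = rhs with two one-sided recursions."""
    n = len(d)
    y = list(rhs)
    for i in range(1, n):               # forward recursion: L y = rhs
        y[i] -= l[i - 1] * y[i - 1]
    x = [y[i] / d[i] for i in range(n)]  # diagonal scaling: D z = y
    for i in range(n - 2, -1, -1):      # backward recursion: L^T x = z
        x[i] -= l[i] * x[i + 1]
    return x

def compact_derivative(f, h, d_left, d_right):
    """Fourth-order compact first derivative on a uniform grid.

    Assumption for this sketch: the end derivatives d_left and d_right
    are known and are moved to the right-hand side.
    """
    m = len(f) - 2                      # number of interior unknowns
    rhs = [0.75 * (f[i + 1] - f[i - 1]) / h for i in range(1, m + 1)]
    rhs[0] -= 0.25 * d_left
    rhs[-1] -= 0.25 * d_right
    d, l = ldlt_tridiagonal([1.0] * m, [0.25] * (m - 1))
    return [d_left] + ldlt_solve(d, l, rhs) + [d_right]
```

For this particular matrix the entries of `l` settle rapidly toward the constant 2 - sqrt(3) away from the first row, which is the constant recursion coefficient that an unbounded or cyclic line of data would give.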
We show, using three different implementation strategies, that each such constant-coefficient recursion can then be carried out in a reasonably efficient manner across several processors of a distributed-memory computer, although the cost becomes significantly greater than that of applying the conventional difference scheme of the corresponding order of accuracy. Nevertheless, the compact schemes remain viable within a massively parallel atmospheric model where high-order numerics are desired. Spatial filters with recursive components, such as the various Butterworth filters, involve similar numerical considerations and, as we show, may be handled in a parallel computing environment in the same way.
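The abstract does not spell out the three strategies, so the sketch below shows only one generic way to split a constant-coefficient first-order recursion y[i] = b[i] - a*y[i-1] across chunks of a line, and it should not be read as the report's method. Each chunk stands in for one processor and is simulated serially here. The names `serial_recursion`, `chunked_recursion`, and `A_PADE` are hypothetical.

```python
import math

# Constant coefficient of the fourth-order tridiagonal compact scheme
# (an illustrative choice; any |a| < 1 behaves similarly).
A_PADE = 2.0 - math.sqrt(3.0)

def serial_recursion(b, a, inflow=0.0):
    """Reference recursion y[i] = b[i] - a * y[i-1], with y[-1] = inflow."""
    y = []
    prev = inflow
    for v in b:
        prev = v - a * prev
        y.append(prev)
    return y

def chunked_recursion(b, a, nchunks):
    """Same recursion, organised as work on nchunks contiguous pieces."""
    n = len(b)
    if not 1 <= nchunks <= n:
        raise ValueError("need 1 <= nchunks <= len(b)")
    bounds = [n * k // nchunks for k in range(nchunks + 1)]

    # Phase 1 (independent per chunk): local recursion with zero inflow.
    local = [serial_recursion(b[bounds[k]:bounds[k + 1]], a)
             for k in range(nchunks)]

    # Phase 2 (short serial sweep): one carried value per chunk boundary.
    carries = [0.0]
    for k in range(nchunks):
        length = bounds[k + 1] - bounds[k]
        carries.append(local[k][-1] + (-a) ** length * carries[k])

    # Phase 3 (independent per chunk): add the homogeneous correction
    # (-a)^(j+1) * inflow to the j-th local value.
    out = []
    for k in range(nchunks):
        c = carries[k]
        out.extend(v + (-a) ** (j + 1) * c for j, v in enumerate(local[k]))
    return out
```

Only one number per chunk boundary has to pass between neighbours in phase 2, and because |a| < 1 the correction in phase 3 decays geometrically with distance from the chunk's left edge. The extra pass over the data is one reason a distributed version costs more than the serial recursion.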
