Numerical solution of DIVA / SSA

The DIVA and SSA momentum balances share the same 2D depth-integrated form: a nonlinear elliptic problem for the depth-averaged horizontal velocity $\bar{\mathbf u} = (\bar u, \bar v)$, with coefficients $(\mu, \beta_\mathrm{eff})$ that depend on the solution. Yelmo discretises this problem on the Arakawa C-grid (the ac-grid) and linearises it by Picard iteration: each iteration freezes $(\mu, \beta_\mathrm{eff}, H)$ at the previous iterate, solves a linear system for the new $\bar{\mathbf u}$, relaxes the update, and repeats until convergence.
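The freeze / solve / relax cycle can be illustrated on a scalar toy problem. The sketch below is not Yelmo code: it assumes a single unknown, a shear-thinning coefficient $\mu(u) = (u^2 + \epsilon)^{(1-n)/(2n)}$ and the balance $\mu(u)\,u = \tau$; the function name, relaxation factor and tolerance are all hypothetical.

```python
def picard_solve(tau, n=3.0, eps=1e-12, relax=0.7, tol=1e-10, max_iter=500):
    """Toy Picard iteration for mu(u) * u = tau with a shear-thinning mu(u)."""
    u = 1.0  # initial guess
    for it in range(1, max_iter + 1):
        # 1. Freeze the nonlinear coefficient at the previous iterate.
        mu = (u * u + eps) ** ((1.0 - n) / (2.0 * n))
        # 2. Solve the now-linear problem mu * u_lin = tau.
        u_lin = tau / mu
        # 3. Relax: move only part of the way towards the linear solution.
        u_next = (1.0 - relax) * u + relax * u_lin
        # 4. Stop once the update is small relative to the solution.
        if abs(u_next - u) <= tol * max(abs(u_next), 1.0):
            return u_next, it
        u = u_next
    raise RuntimeError("Picard iteration did not converge")


u, iterations = picard_solve(tau=2.0)
print(u, iterations)
```

For $n = 3$ and $\tau = 2$ the exact solution is very close to $u = \tau^3 = 8$. In the full model, step 2 is the sparse linear solve described in the rest of this section.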

Two solvers are provided for the linear step. At interior cells they assemble equivalent equations (the two systems are related by a sign and a cell-area factor), but they differ in the structure of the assembled matrix and in how boundary conditions are imposed. The solver is chosen at runtime through the parameter ydyn.ssa_solver, set to "residual" or "energy", on which the DIVA solver dispatches.

Discretisation conventions

Yelmo uses the Arakawa C / staggered AC grid:

aa-nodes (cell centres): scalars $H$, $s$, $\bar\mu$, $\bar\mu\,H$, $\beta$ (before staggering).
acx-nodes (right face of each aa-cell): $\bar u$, $\tau_{d,x}$, $\beta_{\mathrm{eff},x}$.
acy-nodes (top face): $\bar v$, $\tau_{d,y}$, $\beta_{\mathrm{eff},y}$.
ab-nodes (cell corners): cross-coupling viscosity $(\bar\mu H)^{\mathrm{ab}}$, obtained by averaging $\bar\mu\,H$ from its four neighbouring aa-cells (stagger_visc_aa_ab).
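The aa-to-ab averaging in the last item can be sketched as follows. This is a minimal illustration assuming a plain four-point mean on nested lists indexed `[j][i]`; the actual stagger_visc_aa_ab routine may treat margins or ice-free neighbours differently, and the function name here is hypothetical.

```python
def stagger_aa_to_ab(n_aa):
    """Average a cell-centre (aa) field onto cell corners (ab).

    n_aa[j][i] is the value at the centre of cell (i, j). The returned
    n_ab[j][i] sits on the top-right corner of cell (i, j). Corners along the
    last row and column lack four neighbours and are omitted in this sketch.
    """
    ny, nx = len(n_aa), len(n_aa[0])
    return [
        [
            0.25 * (n_aa[j][i] + n_aa[j][i + 1]
                    + n_aa[j + 1][i] + n_aa[j + 1][i + 1])
            for i in range(nx - 1)
        ]
        for j in range(ny - 1)
    ]


n_ab = stagger_aa_to_ab([[1.0, 2.0], [3.0, 4.0]])
print(n_ab)
```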

The two unknowns per cell ($\bar u$ at the right face and $\bar v$ at the top face) are interleaved into a single state vector $\mathbf x = [\,\bar u_{1}, \bar v_{1}, \bar u_{2}, \bar v_{2}, \dots\,]$, so the assembled linear system has dimension $2 N_\mathrm{cells}$. The matrix is stored in CSR format and solved with LIS (Library of Iterative Solvers for Linear Systems); the iterative method and preconditioner are configured at runtime via ydyn.ssa_lis_opt.
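The interleaved layout amounts to a simple index map between cells and matrix rows. The sketch below uses 0-based indices and assumes cells are numbered with `i` varying fastest; the actual cell ordering in Yelmo is not specified here, and both function names are hypothetical.

```python
def row_index(i, j, nx, component):
    """Row of the u (component=0) or v (component=1) unknown of cell (i, j)."""
    cell = j * nx + i  # assumed cell numbering: i varies fastest
    return 2 * cell + component


def cell_of_row(row, nx):
    """Inverse map: matrix row -> (i, j, component)."""
    cell, component = divmod(row, 2)
    j, i = divmod(cell, nx)
    return i, j, component


nx, ny = 4, 3
print(row_index(2, 1, nx, 1), cell_of_row(13, nx))
```

With this layout the $\bar u$ and $\bar v$ unknowns of one cell occupy adjacent rows, which keeps the cross-coupling entries close to the diagonal.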

Per-row solver masks (ssa_mask_acx, ssa_mask_acy) classify each ac-node as one of:

0: Dirichlet zero velocity (frozen bed, edge of domain, etc.);
-1: Dirichlet prescribed velocity (e.g. observed input);
1: solve (interior);
3: lateral / calving-front Neumann row driven by the depth-integrated lateral stress $\tau_{l,\mathrm{int}}$ .
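For readability, the four mask codes can be given names. The enumeration below is purely illustrative: the numeric values come from the list above, while the Python names and the helper function are hypothetical and do not appear in Yelmo.

```python
from enum import IntEnum


class SsaMask(IntEnum):
    PRESCRIBED = -1  # Dirichlet, prescribed velocity
    ZERO = 0         # Dirichlet, zero velocity
    SOLVE = 1        # interior row, solved for
    FRONT = 3        # lateral / calving-front Neumann row


def count_rows(mask):
    """Count how many ac-nodes fall into each row type."""
    counts = {kind: 0 for kind in SsaMask}
    for row in mask:
        for code in row:
            counts[SsaMask(code)] += 1  # raises ValueError on an unknown code
    return counts


counts = count_rows([[0, 1, 1], [3, -1, 1]])
print(counts)
```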

The two assemblers share the same argument list, so either can be called from the Picard loop in calc_velocity_diva without further changes.

Solver A — residual form (Yelmo v1)

The residual assembler (solver_ssa_ac.f90) builds a non-symmetric matrix $A_\mathrm{res}$ and right-hand side $\mathbf b_\mathrm{res}$ directly from the strong form of the SSA PDE. Writing $N \equiv \bar\mu\,H$ for brevity, the interior $\bar u$ row is a 5-point stencil in $\bar u$ plus cross-coupling terms in $\bar v$ on the neighbouring acy-faces, balanced against the basal drag and the driving stress. Schematically (cell $(i,j)$, with $N^{\mathrm{aa}}$ on aa-nodes and $N^{\mathrm{ab}}$ on ab-nodes):

$\sum_{(i',j')}\alpha^{x}_{i',j'}\,\bar u_{i',j'} \;+\; \sum_{(i',j')}\gamma^{xy}_{i',j'}\,\bar v_{i',j'} \;-\; \beta_\mathrm{eff}\,\bar u_{i,j} \;=\; \tau_{d,x}(i,j).$

The $\bar u$ stencil coefficients carry the membrane stretching and shearing,

$\alpha^{x}_{i-1,j} \;=\; \tfrac{4}{\Delta x^{2}}\,N^{\mathrm{aa}}_{i,j},$

$\alpha^{x}_{i+1,j} \;=\; \tfrac{4}{\Delta x^{2}}\,N^{\mathrm{aa}}_{i+1,j},$

$\alpha^{x}_{i,j-1} \;=\; \tfrac{1}{\Delta y^{2}}\,N^{\mathrm{ab}}_{i,j-1},$

$\alpha^{x}_{i,j+1} \;=\; \tfrac{1}{\Delta y^{2}}\,N^{\mathrm{ab}}_{i,j},$

with the centre coefficient $\alpha^{x}_{i,j}$ being minus the sum of its four neighbours. The $\gamma^{xy}$ coefficients couple the row to four neighbouring $\bar v$-faces via the $\partial \bar v/\partial x$ and $\partial \bar v/\partial y$ cross-terms in the membrane stress. The $\bar v$ row has an analogous structure.
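The $\bar u$ part of the interior row can be written out directly from the four coefficients above. This is a sketch of those formulas only: it omits the $\gamma^{xy}$ cross-coupling terms and all boundary handling, and the function and argument names are hypothetical.

```python
def u_row_coefficients(n_aa_left, n_aa_right, n_ab_below, n_ab_above,
                       beta_eff, dx, dy):
    """Coefficients multiplying u in the interior u row of cell (i, j).

    n_aa_left  = N^aa at (i, j)      n_aa_right = N^aa at (i+1, j)
    n_ab_below = N^ab at (i, j-1)    n_ab_above = N^ab at (i, j)
    """
    west = 4.0 * n_aa_left / dx**2    # alpha^x at (i-1, j)
    east = 4.0 * n_aa_right / dx**2   # alpha^x at (i+1, j)
    south = n_ab_below / dy**2        # alpha^x at (i, j-1)
    north = n_ab_above / dy**2        # alpha^x at (i, j+1)
    centre = -(west + east + south + north)
    return {
        "west": west, "east": east, "south": south, "north": north,
        "centre": centre,
        # Total coefficient of u at (i, j): membrane part plus basal drag.
        "diagonal": centre - beta_eff,
    }


coeffs = u_row_coefficients(1.0, 1.0, 1.0, 1.0, beta_eff=0.5, dx=1.0, dy=1.0)
print(coeffs)
```

Because the centre coefficient cancels the four neighbours, a spatially uniform $\bar u$ produces no membrane stress and the row reduces to $-\beta_\mathrm{eff}\,\bar u = \tau_{d,x}$.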

The right-hand side is the driving stress itself (no cell-area factor). The system $A_\mathrm{res}\,\mathbf x = \mathbf b_\mathrm{res}$ is non-symmetric, and is solved with a Krylov method such as BiCGStab plus an algebraic preconditioner.

Lateral / calving-front rows replace the membrane-stress balance by the prescribed depth-integrated lateral stress $\tau_{l,\mathrm{int}}$ in a row-specific stencil.
Dirichlet rows replace the equation by $\bar u_{i,j} = u^*$ with the corresponding column kept in place (non-symmetric).
Free-slip, no-slip and periodic conditions are applied per side via the boundary-code helper get_neighbor_indices_bc_codes.

This is the formulation inherited from Yelmo v1 and is the legacy default.

Solver B — energy form (new)

The energy assembler (solver_ssa_ac_energy.f90) builds the Hessian of a discrete energy functional and solves $K\,\mathbf x = \mathbf b$ for the velocity that minimises that energy. With $(\mu, \beta, H)$ frozen during each Picard step the energy is quadratic, so $K = \frac{\partial^{2} W}{\partial \mathbf x^{\,2}}$ is symmetric positive (semi-)definite and the linear step can use a symmetric Krylov method — CG with an AMG preconditioner — in place of BiCGStab.
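The link between minimising a quadratic energy and solving $K\,\mathbf x = \mathbf b$ with CG can be shown on a small stand-in system. The sketch below is plain, unpreconditioned CG on a dense 1D matrix (tridiagonal membrane stiffness plus drag on the diagonal); it does not use LIS or AMG, and the matrix values are made up for illustration.

```python
def matvec(K, x):
    return [sum(kij * xj for kij, xj in zip(row, x)) for row in K]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(K, b, tol=1e-12, max_iter=200):
    """Unpreconditioned CG for a symmetric positive definite K."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - K x for the initial guess x = 0
    p = list(r)          # first search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 <= tol:
            break
        Kp = matvec(K, p)
        alpha = rr / dot(p, Kp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * kpi for ri, kpi in zip(r, Kp)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def energy(K, b, x):
    """Quadratic energy whose minimiser satisfies K x = b."""
    return 0.5 * dot(x, matvec(K, x)) - dot(b, x)


# Stand-in for K: tridiagonal stiffness plus basal drag on the diagonal.
n, stiffness, drag = 6, 4.0, 0.5
K = [[0.0] * n for _ in range(n)]
for i in range(n):
    K[i][i] = 2.0 * stiffness + drag
    if i > 0:
        K[i][i - 1] = -stiffness
    if i < n - 1:
        K[i][i + 1] = -stiffness
b = [1.0] * n

x = conjugate_gradient(K, b)
print(x, energy(K, b, x))
```

Any perturbation of `x` raises the energy, which is the discrete counterpart of the statement that the solution minimises the functional.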

Energy density

The continuum energy density underlying the SSA momentum balance is the sum of a membrane (deformation) term, a basal-drag term, and a gravitational potential-energy term:

$\begin{aligned} W \;=\; \bar\mu\,H\,&\biggl( 2\!\left(\frac{\partial \bar u}{\partial x}\right)^{\!2} + 2\!\left(\frac{\partial \bar v}{\partial y}\right)^{\!2} + 2\,\frac{\partial \bar u}{\partial x}\,\frac{\partial \bar v}{\partial y} + \tfrac{1}{2}\!\left(\frac{\partial \bar u}{\partial y} + \frac{\partial \bar v}{\partial x}\right)^{\!2} \,\biggr) \\[4pt] &+\; \tfrac{1}{2}\,\beta\,(\bar u^{\,2} + \bar v^{\,2}) \;+\; \rho_i\,g\,H\,\!\left(\bar u\,\frac{\partial s}{\partial x} + \bar v\,\frac{\partial s}{\partial y}\right). \end{aligned}$

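The first line is $2\,\bar\mu\,H\,\dot{\bar\varepsilon}_e^{\,2}$, where $\dot{\bar\varepsilon}_e^{\,2} = \left(\frac{\partial \bar u}{\partial x}\right)^{2} + \left(\frac{\partial \bar v}{\partial y}\right)^{2} + \frac{\partial \bar u}{\partial x}\,\frac{\partial \bar v}{\partial y} + \tfrac{1}{4}\left(\frac{\partial \bar u}{\partial y} + \frac{\partial \bar v}{\partial x}\right)^{2}$ is the squared effective strain rate of the depth-averaged flow; the cross term comes from the vertical strain rate through incompressibility. Stationarity of the integral $\mathcal W = \int W \,\mathrm dx\,\mathrm dy$ with respect to $(\bar u, \bar v)$, with $(\bar\mu, \beta, H)$ held fixed, reproduces the SSA / DIVA strong form (for DIVA, $\beta$ stands for $\beta_\mathrm{eff}$), so any critical point of $\mathcal W$ is a solution of the linearised momentum balance.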
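The stationarity claim can be checked for the $\bar u$ component. The following is a sketch that writes $N \equiv \bar\mu\,H$, treats $N$, $\beta$ and $H$ as fixed, and omits the boundary terms produced by integration by parts. The partial derivatives of $W$ are

$\frac{\partial W}{\partial(\partial \bar u/\partial x)} \;=\; 2N\left(2\,\frac{\partial \bar u}{\partial x} + \frac{\partial \bar v}{\partial y}\right), \qquad \frac{\partial W}{\partial(\partial \bar u/\partial y)} \;=\; N\left(\frac{\partial \bar u}{\partial y} + \frac{\partial \bar v}{\partial x}\right), \qquad \frac{\partial W}{\partial \bar u} \;=\; \beta\,\bar u + \rho_i\,g\,H\,\frac{\partial s}{\partial x},$

so the Euler-Lagrange equation $\frac{\partial W}{\partial \bar u} - \frac{\partial}{\partial x}\frac{\partial W}{\partial(\partial \bar u/\partial x)} - \frac{\partial}{\partial y}\frac{\partial W}{\partial(\partial \bar u/\partial y)} = 0$ becomes

$\frac{\partial}{\partial x}\!\left[2N\left(2\,\frac{\partial \bar u}{\partial x} + \frac{\partial \bar v}{\partial y}\right)\right] + \frac{\partial}{\partial y}\!\left[N\left(\frac{\partial \bar u}{\partial y} + \frac{\partial \bar v}{\partial x}\right)\right] - \beta\,\bar u \;=\; \rho_i\,g\,H\,\frac{\partial s}{\partial x}.$

Identifying the right-hand side with $\tau_{d,x}$ recovers the structure of the residual $\bar u$ row: membrane terms minus basal drag, balanced against the driving stress. The $\bar v$ component follows by exchanging the roles of $x$ and $y$.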

Discrete assembly

Yelmo's discrete energy is the cell-by-cell evaluation of $W$ on the C-grid, with the derivatives $\frac{\partial \bar u}{\partial x}, \frac{\partial \bar v}{\partial y}, \frac{\partial \bar u}{\partial y}, \frac{\partial \bar v}{\partial x}$ expressed as the natural finite differences between adjacent ac-nodes. The Hessian $K$ then has the same stencil graph as the residual matrix but is symmetric. At interior cells the two formulations are related by an exact algebraic identity (documented in the header of the energy assembler):

$K_\mathrm{inner} \;=\; -\,A_\mathrm{res, inner}\cdot \Delta x\,\Delta y, \qquad \mathbf b_\mathrm{inner} \;=\; -\,\boldsymbol\tau_d \cdot \Delta x\,\Delta y.$
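The identity can be verified numerically in one dimension, where the cell "area" reduces to $\Delta x$. The sketch below assumes a 1D discrete energy with $N$ at cell centres and $\bar u$ at faces, extracts one Hessian row by finite differences of that energy (exact for a quadratic), and compares it with the corresponding residual row. All names and values are hypothetical.

```python
def energy(u, N, beta, tau, dx):
    """1D discrete energy; cell c lies between faces c-1 and c (N[0] is unused)."""
    membrane = sum(2.0 * N[c] * ((u[c] - u[c - 1]) / dx) ** 2 * dx
                   for c in range(1, len(u)))
    nodal = sum((0.5 * beta * ui * ui + tau * ui) * dx for ui in u)
    return membrane + nodal


def unit(n, i, h=1.0):
    e = [0.0] * n
    e[i] = h
    return e


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def hessian_entry(i, j, n, *args):
    """K_ij as a second difference of the energy (exact for a quadratic)."""
    ei, ej = unit(n, i), unit(n, j)
    return (energy(add(ei, ej), *args) - energy(ei, *args)
            - energy(ej, *args) + energy([0.0] * n, *args))


def rhs_entry(i, n, *args):
    """b_i = -dE/du_i at u = 0, from a central difference."""
    return -(energy(unit(n, i, 1.0), *args)
             - energy(unit(n, i, -1.0), *args)) / 2.0


def residual_row(i, N, beta, dx):
    """Strong-form row: membrane stencil minus drag, balanced against tau."""
    west = 4.0 * N[i] / dx**2
    east = 4.0 * N[i + 1] / dx**2
    return {i - 1: west, i: -(west + east) - beta, i + 1: east}


N = [1.0, 2.0, 3.0, 4.0, 5.0]
beta, tau, dx = 0.5, 7.0, 2.0
n, i = len(N), 2

A_row = residual_row(i, N, beta, dx)
K_row = {j: hessian_entry(i, j, n, N, beta, tau, dx) for j in A_row}
b_i = rhs_entry(i, n, N, beta, tau, dx)
print(A_row, K_row, b_i)
```

Each entry of `K_row` equals the matching entry of `A_row` multiplied by `-dx`, and `b_i` equals `-tau * dx`, mirroring the 2D identity with $\Delta x\,\Delta y$ replaced by $\Delta x$.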

At interior cells the energy formulation is therefore the residual formulation rescaled by the cell area and a sign, so the interior equations are the same and any difference in the computed velocity comes from the boundary rows or the linear-solver tolerance. The two solvers differ at boundaries:

Lateral / calving-front BC: the front stress enters the energy as a boundary-work term $\pm\,\tau_{l,\mathrm{int}}\,\Delta y$ on the RHS, with the sign determined by the outward normal. This is the variational form of the Neumann condition and is symmetric by construction.
Dirichlet rows: prescribed values are imposed by static condensation — the prescribed column is multiplied by the known velocity and moved to the RHS — instead of by row replacement. This preserves the symmetry of $K$ and so allows CG / AMG to be used without spoiling the SPD structure.
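The two Dirichlet treatments can be compared on a 3x3 stand-in system. In this sketch the condensed system keeps an identity row for the prescribed unknown; whether the energy assembler does this or removes the unknown altogether is not stated above, so treat that detail as an assumption. The matrix, right-hand side and function names are made up for illustration.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def is_symmetric(A):
    return all(A[i][j] == A[j][i] for i in range(len(A)) for j in range(len(A)))


def row_replacement(K, b, k, value):
    """Residual-style Dirichlet: overwrite row k, keep column k."""
    A, rhs = [row[:] for row in K], b[:]
    A[k] = [1.0 if j == k else 0.0 for j in range(len(b))]
    rhs[k] = value
    return A, rhs


def condensation(K, b, k, value):
    """Energy-style Dirichlet: move column k times the known value to the RHS."""
    A, rhs = [row[:] for row in K], b[:]
    for i in range(len(b)):
        if i != k:
            rhs[i] -= A[i][k] * value
            A[i][k] = 0.0
    A[k] = [1.0 if j == k else 0.0 for j in range(len(b))]
    rhs[k] = value
    return A, rhs


K = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [1.0, 2.0, 3.0]
A1, b1 = row_replacement(K, b, k=0, value=2.0)
A2, b2 = condensation(K, b, k=0, value=2.0)
x1, x2 = solve(A1, b1), solve(A2, b2)
print(is_symmetric(A1), is_symmetric(A2), x1, x2)
```

Both systems give the same velocities, but only the condensed matrix remains symmetric and therefore suitable for CG.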

The viscosity staggering aa $\to$ ab is the same routine (stagger_visc_aa_ab) used by the residual assembler.

Why bother?

Two practical advantages flow from the SPD structure:

Linear solver choice: CG with AMG is typically faster and more robust than BiCGStab + ILU for large SPD systems, and the CG error decreases monotonically in the energy norm.
Physical interpretability and discrete consistency: every term in $K$ and $\mathbf b$ corresponds to a contribution to a discrete energy. Boundary conditions that are natural for the continuum functional (e.g. Neumann front stress) become natural for the discrete one. This makes it easier to add new physics — e.g. alternative friction laws, additional body forces — in a way that provably preserves the variational structure.

The Picard loop, the viscosity update, the F-integral closure and the basal-stress and 3D velocity diagnostics are identical between the two solvers: only the linear-system assembly differs.