Debugging, Profiling, and Validation
by Dave Levermore
AMSC 664 lecture, 31 January 2003
Once code is written you must begin three tasks that should continue
so long as the code is in use:
- debugging --- getting the code to work as you intend;
- profiling --- assessing how the code carries out a
given scientific task on a given platform and how its
performance might be improved;
- validation --- assessing how accurately the code
carries out a given scientific task.
These tasks should be carried out on each component (module) of your
code in "isolation". You should be aware of and use all tools at your
disposal. You should keep a log of both your profiling and validation
efforts. There should be a section addressing each in your final
report.
Debugging has two stages:
- to get the code to compile and run,
- to see that it runs as intended.
The first stage must be completed before doing anything else. It must
be revisited every time new coding is introduced. The second stage
begins during the profiling and validation processes, but should
continue throughout the useful life of the code. Of course, most
compilers come with debugging tools that help you through the first
stage. There are also platform-dependent debugging tools as well as
many general strategies that can be applied to the second stage.
Examples will be given in subsequent lectures.
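A general second-stage strategy is to instrument the code with runtime
checks of properties that every correct state must satisfy, so that a
bug announces itself where it first appears rather than where it
finally crashes the run. A minimal Python sketch (the field names are
hypothetical, standing in for whatever your code evolves):

    import numpy as np

    def check_invariants(density, total_energy, step):
        """Fail loudly at the first step where something is already wrong."""
        assert np.all(np.isfinite(density)), f"non-finite density at step {step}"
        assert np.all(density > 0.0), f"nonpositive density at step {step}"
        assert np.isfinite(total_energy), f"non-finite energy at step {step}"

Calling such a routine after every time step costs little during
development, and the checks can be disabled (for example, by running
with python -O) once the code is trusted.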
Profiling involves the insertion of diagnostics into your
code to assess its performance on a given platform. You want to find
answers to questions such as:
- What is the sensitivity to round-off error?
- In which subroutines and loops is the code spending its time?
- On parallel platforms, is the load balanced?
Is there a synchronization bottleneck?
Are there communication delays?
You can use such information to tune your code's performance. Many of
the tools with which you make such assessments are platform dependent.
For example, one can check round-off sensitivity by masking the two
least significant bits of the calculation. In some cases this can be
done at the level of the compiler, while in others one has to insert
some lines into your code, as sketched below. Other examples will be
given in subsequent lectures.
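The lecture leaves the masking mechanism to platform-specific tools;
one portable way to do it by hand, assuming IEEE-754 double precision,
is to clear the low mantissa bits directly. A minimal sketch:

    import numpy as np

    def mask_low_bits(x, nbits=2):
        """Zero the nbits least significant mantissa bits of IEEE-754 doubles."""
        bits = np.atleast_1d(np.asarray(x, dtype=np.float64)).view(np.int64)
        return (bits & ~np.int64((1 << nbits) - 1)).view(np.float64)

Inserting a line like x = mask_low_bits(x) after key arithmetic steps
and comparing the masked run against the unmasked one gives a crude
measure of how much the answer depends on the last bits retained.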
Validation of both your model and your algorithms is
critical to the development of any piece of scientific software.
How one might do this is the focus of this lecture. We will mention
several approaches that are geared toward traditional simulation
codes, but most of them have analogs that can be applied to any
scientific computation project.
- Hand Calculation Checks. These are among the most basic
algorithm checks. Simply set up small scale calculations that
you can check by "hand". For example, check a linear system
solver on a two-by-two matrix. Similarly, check a two-dimensional
simulator on a two-by-two grid. For such small scale
problems you can look at every number produced by your code and
compare it with your hand calculation (usually carried out on a
programmable calculator or some other computer). It does not
matter that such under-resolved calculations are not physically
relevant. Such checks are used primarily for debugging. Because
they are time consuming, they should not be the first thing you
try. However, if a bug is suspected at any stage of your
validation process you should consider using one.
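For a linear solver, such a check might look like the following
sketch, where np.linalg.solve stands in for the solver under test:

    import numpy as np

    # Hand calculation (done on paper): solve A x = b for
    #   2*x0 + 1*x1 = 5
    #   1*x0 + 3*x1 = 10   which gives  x0 = 1, x1 = 3.
    A = np.array([[2.0, 1.0], [1.0, 3.0]])
    b = np.array([5.0, 10.0])
    x_hand = np.array([1.0, 3.0])

    x_code = np.linalg.solve(A, b)  # replace with your own solver
    assert np.allclose(x_code, x_hand), f"solver disagrees: {x_code}"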
- Basic Stability Studies. These are also among the most
basic tests of your algorithms. You must check that simple
physically stable states are indeed stable when simulated by your
code. For example, in a gas dynamics setting, does your code
maintain a constant solution with uniform density, velocity, and
temperature? Of course, you should check that such a solution is
in fact a stable solution of your model (or a reduction thereof)
before carrying out its simulation. If your simulation is
unstable, then the problem is either a bug in the code (see
above), an unstable choice of algorithm, or both.
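A sketch of such a test in Python, where the interface
step(state, dt) -> state is a hypothetical stand-in for your time
integrator:

    import numpy as np

    def check_constant_state(step, n_cells=64, n_steps=100, tol=1e-12):
        """Check that a uniform gas state is preserved by step(state, dt)."""
        state = {
            "density":     np.full(n_cells, 1.0),
            "velocity":    np.full(n_cells, 0.5),
            "temperature": np.full(n_cells, 300.0),
        }
        initial = {name: field.copy() for name, field in state.items()}
        for _ in range(n_steps):
            state = step(state, dt=1.0e-3)
        for name in initial:
            drift = np.max(np.abs(state[name] - initial[name]))
            print(f"{name}: max drift after {n_steps} steps = {drift:.3e}")
            assert drift < tol, f"{name} drifts from the constant state"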
- Symmetry Invariance Studies. These simply compute two
or more problems that are computationally identical up to a
rotation, reflection, or translation to see if the simulations
behave as they should.
For example, a one-dimensional problem with slab symmetry can be
set up on a grid first from left to right and then from right to
left. If these simulations were to be carried out with a perfect
algorithm on a perfect platform they would produce the same
numbers up to a reflection. In a realistic setting, however, one
will see differences, and these should be understood, especially if
they are significantly larger than the expected round-off error.
Such differences might be caused by a bug like a subtle index
mistake in a loop or a boundary value specification. They might
also arise due to an asymmetry in your algorithm. For example,
if your code includes a Gaussian elimination subroutine it may
always back-solve from right to left.
Another example in the same spirit is to compute a problem with
periodic boundary conditions and its shift by half a period.
Any funny-looking behavior at a computational boundary that does
not then shift to the middle of the problem is almost certainly
an indication of a bug, most likely in your imposition of the
periodic boundary condition.
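The reflection test can be automated along these lines, where simulate
is a stand-in for your slab-symmetric solver and u0 holds a scalar
field (components of a vector field would also need sign flips under
reflection):

    import numpy as np

    def reflection_check(simulate, u0, t_final):
        """Run left-to-right and right-to-left and report the asymmetry."""
        u_direct    = simulate(u0, t_final)
        u_reflected = simulate(u0[::-1], t_final)
        diff = np.max(np.abs(u_direct - u_reflected[::-1]))
        print(f"max reflection asymmetry: {diff:.3e}")
        return diff

With a perfect algorithm on a perfect platform diff would be zero; in
practice it should be compared with the expected round-off error.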
- Symmetry Preservation Studies. These simulate solutions
of the model (or a reduction thereof) that have more symmetries
than can be preserved by the simulations. For example, how well
does your code compute solutions with spherical symmetry? (There
is always symmetry breaking in the simulation of such solutions
when they are computed by a multi-dimensional code except in some
trivial cases.) How much symmetry breaking one sees will depend on
the stability of the spherically symmetric solution. Similarly,
in a multi-dimensional code one can simulate a solution with slab
symmetry on a grid aligned with the plane of symmetry and on a
grid oblique to the plane of symmetry. In general one will see
differences in the speed at which the waves propagate.
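One way to quantify symmetry breaking, sketched here for a
two-dimensional field that should be radially symmetric, is to bin the
cells by radius and report the largest spread within a bin:

    import numpy as np

    def radial_spread(u, x, y, n_bins=32):
        """Largest in-bin spread of u binned by radius; a perfectly
        radially symmetric field gives (up to binning error) zero."""
        r, vals = np.hypot(x, y).ravel(), u.ravel()
        idx = np.minimum((r / r.max() * n_bins).astype(int), n_bins - 1)
        spread = 0.0
        for i in range(n_bins):
            in_bin = vals[idx == i]
            if in_bin.size > 1:
                spread = max(spread, float(in_bin.max() - in_bin.min()))
        return spread

Tracking this number over a run shows how quickly the simulation
breaks the symmetry of the initial data.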
- Comparisons with Special Solutions. Many models have
reductions for which analytic solutions can be found. You should
use such solutions to validate your algorithms. These can
include spatially homogeneous solutions, self-similar solutions,
traveling-wave and other steady-state solutions. Almost as good
as an analytical solution is a special solution that can be
computed by numerically integrating an ordinary differential
equation. Such a reduction can be found through symmetries.
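As a self-contained illustration (assuming, for the sake of the
sketch, that the model reduces to the heat equation u_t = u_xx), the
self-similar heat kernel u(x,t) = exp(-x^2/(4t)) / sqrt(4*pi*t) gives
an exact solution against which a simple explicit scheme can be
checked:

    import numpy as np

    def exact(x, t):
        return np.exp(-x**2 / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)

    L, n = 20.0, 400
    x = np.linspace(-L / 2, L / 2, n)
    dx = x[1] - x[0]
    dt = 0.4 * dx**2            # respects the stability limit dt <= dx**2/2
    t, t_final = 1.0, 2.0

    u = exact(x, t)             # start from the exact profile at t = 1
    while t < t_final - 1e-12:
        u[1:-1] += dt / dx**2 * (u[2:] - 2.0 * u[1:-1] + u[:-2])
        t += dt

    print("max error vs exact solution:", np.max(np.abs(u - exact(x, t))))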
- Convergence Studies. These can be applied at every level, from
a given subroutine to a full calculation. For
example, many efficient (linear or nonlinear) system solvers use
iterative algorithms. Both the convergence rate and the stopping
criterion should be validated on a few systems representative of
the systems your code will face in a typical simulation. If a
convergence rate is known theoretically for your algorithm, your
code should converge at that rate (allowing for round-off).
For a code that simulates the solution of a system of differential
equations the convergence rate should be validated as the spatial
grid is refined and as the time-step is reduced. To compute
error one must study either a special solution, a well-resolved
benchmark calculation, or both.
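Continuing the heat-equation sketch above, a refinement study
estimates the observed order of accuracy from errors on successively
finer grids; for that second-order scheme the reported order should be
close to 2:

    import numpy as np

    def error_at(n, t0=1.0, t1=1.5, L=20.0):
        """Max error of an explicit heat-equation solve on an n-point grid."""
        exact = lambda x, t: np.exp(-x**2 / (4 * t)) / np.sqrt(4 * np.pi * t)
        x = np.linspace(-L / 2, L / 2, n)
        dx = x[1] - x[0]
        dt = 0.4 * dx**2
        u, t = exact(x, t0), t0
        while t < t1 - 1e-12:
            u[1:-1] += dt / dx**2 * (u[2:] - 2.0 * u[1:-1] + u[:-2])
            t += dt
        return np.max(np.abs(u - exact(x, t)))

    # Halving dx should cut the error by about 2**p for a pth-order scheme.
    errs = [error_at(n) for n in (100, 200, 400)]
    for e_coarse, e_fine in zip(errs, errs[1:]):
        print("observed order p =", np.log2(e_coarse / e_fine))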
- Comparisons with Other Code. One of the most useful
methods to validate both your algorithms and your model is through
code comparisons. This can be done either by installing various
options in your own code or by using another code.
When validating an algorithm, comparisons should be done on the
same model. For example, give yourself the capability to choose
from among several algorithms in your code. Algorithms that are
known to be accurate can be used to validate faster algorithms.
For example, the direct solution of a linear system can be used
to validate an iterative solver, even if the direct solver is
far too slow to be used in a full simulation. Similarly, one
iterative solver can be used to validate another. Validation on
the level of a full simulation can be studied by comparing with
a simulation by a mature code that uses completely different
algorithms. For example, you can compare a Monte Carlo
simulation with a finite element simulation of the same model.
When validating a model, comparisons should be done either with
the same algorithm or in such a way as to minimize the effect of
any algorithm differences. For example, when validating a model
of chemical reactions, one should compare it to a more complete
model solved by the same algorithm. On the other hand, when
comparing a diffusive model of particle transport with a kinetic
one the algorithms will be very different.
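For the direct-versus-iterative comparison mentioned above, a sketch
(with plain Jacobi iteration standing in for the iterative solver
under test, on a matrix built to guarantee convergence):

    import numpy as np

    def jacobi(A, b, tol=1e-10, max_iter=10000):
        """Plain Jacobi iteration; converges for diagonally dominant A."""
        d = np.diag(A)
        R = A - np.diag(d)
        x = np.zeros_like(b)
        for _ in range(max_iter):
            x_new = (b - R @ x) / d
            if np.max(np.abs(x_new - x)) < tol:
                return x_new
            x = x_new
        raise RuntimeError("Jacobi failed to converge")

    rng = np.random.default_rng(0)
    n = 50
    A = rng.standard_normal((n, n)) + 2 * n * np.eye(n)  # diagonally dominant
    b = rng.standard_normal(n)

    x_direct = np.linalg.solve(A, b)        # trusted (slower) reference
    x_iter = jacobi(A, b)
    print("max discrepancy:", np.max(np.abs(x_iter - x_direct)))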
- Parametric Sensitivity Studies. You should have a sense of
how sensitive your simulations are to uncertainties in your
model. You should know if a small change in the value of some
parameter in your model will lead to a large change in a critical
predicted value. There are three basic approaches to sensitivity
studies: ensembles, linearization, and adjoints. These will be
discussed later in the term.
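As a taste of the linearization approach, the crudest version is a
centered-difference estimate of how a scalar output responds to a
parameter; the model function below is a toy stand-in for a full
simulation:

    import numpy as np

    def sensitivity(model, p, h=1e-6):
        """Centered-difference estimate of d(output)/d(parameter)."""
        return (model(p + h) - model(p - h)) / (2.0 * h)

    model = lambda k: np.exp(-3.0 * k)   # toy: output depends on rate k
    print("d(output)/dk at k = 0.5:", sensitivity(model, 0.5))
    print("exact value:            ", -3.0 * np.exp(-1.5))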
- Comparisons with Benchmark Calculations. In any mature
area of scientific computation there is usually a collection of
so-called benchmark calculations in the literature to which you
can compare your simulations. A benchmark calculation is usually
carried out using a well-validated "full physics" model on a
state-of-the-art platform over weeks or months. While such
comparisons should never replace comparisons with experimental
data, their value is that they usually offer more data than an
experiment can provide.
- Comparisons with Experimental Data. These are the
ultimate tests for any scientific code. Before making any such
comparison you should check that the experiment is being carried
out in a regime where your model is valid. If it is, then you
had better reproduce the data to within its error bars. This is
particularly impressive if your simulations were done before you
saw the experimental data, and even more so if your code
predicted an unanticipated phenomenon.