CS 267, Assignment 0

George Hines

For the assignment statement and objectives, click here.

About the author

I'm a second-year graduate student in Mechanical Engineering, advised by Prof. Andy Packard. My background is in aeronautics and control theory, the areas in which I graduated from Caltech in 2008. At present my interests lie at the interface of control/estimation and plant genetics, an area that promises to provide novel sensing techniques for higher productivity in the agro-food industry. My background in parallel computing is limited to the first day of the ParLab's 2008 Parallel Computing Boot Camp, but I find the ideas of parallel computing well enough motivated, and the computational problems in control and estimation growing sufficiently large, that knowledge of the ideas and techniques behind parallelism will be useful if not necessary when thinking about control design. By taking this class, I'm hoping to gain enough intuition to be able to design a control architecture with parallelism in mind at various levels (hardware/software).

A sample application of parallel computing: turbulent flow simulation

Brief overview

Since the general governing equations of fluid flow are too complex to solve analytically (if indeed they are solvable), simulation is necessary to build intuition about the behavior of different types of flows under different boundary conditions. Stanford's Center for Turbulence Research, LBNL's Center for Computational Sciences and Engineering, and Prof. Dale Pullin's group at Caltech are leading the way in such simulations, which generally fall into two categories: Direct Numerical Simulation (DNS) and Large Eddy Simulation (LES). Briefly, the former is a complete forward integration of the governing (Navier-Stokes) equations of fluid flow, augmented with additional information as necessary. (For example, a combustion simulation might require specification of the reaction kinetics.) By contrast, an LES uses a model for small-scale turbulence structures and assumes similarity of small structures, leaving a reduced number of larger turbulence structures to be simulated completely. LES is computationally easier, but less accurate.

The questions that may be partially answered by turbulence simulations are many and varied. As one example, knowledge of the turbulent flow structure around the wing of an aircraft is useful in determining not only the lift and drag of the wing but also the dominant frequencies at which the wing will be buffeted, indicating the natural frequencies to avoid when choosing construction materials and bracing geometry. Another example that is often discussed in the academic literature is a flame jet, as this is first a hard problem due to fast time scales and second a useful model of many exhaust processes.
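To make the DNS idea of marching the governing equations forward in time a little more concrete, here is a minimal sketch of my own (not code from any of the simulations discussed here) that integrates the 1D viscous Burgers equation, a common toy stand-in for Navier-Stokes, with an explicit scheme on a periodic grid. The grid size, viscosity, and time step are made-up illustrative values; a real DNS does the same kind of time-marching for the full 3D equations on enormously finer grids and with higher-order schemes.

```c
/* Toy illustration of "forward integration of the governing equations":
 * march the 1D viscous Burgers equation u_t + u u_x = nu u_xx forward
 * in time with an explicit scheme on a periodic grid. This is a minimal
 * sketch, not code from any simulation mentioned in this write-up. */
#include <math.h>
#include <stdio.h>

#define N 256  /* grid points; a real DNS uses many orders of magnitude more */

int main(void) {
    const double PI = 3.14159265358979323846;
    const double L = 2.0 * PI, dx = L / N;
    const double nu = 0.05;   /* viscosity (illustrative value) */
    const double dt = 1e-4;   /* time step, small enough for stability here */
    double u[N], unew[N];

    for (int i = 0; i < N; i++)   /* smooth initial condition */
        u[i] = sin(i * dx);

    for (int step = 0; step < 10000; step++) {
        for (int i = 0; i < N; i++) {
            int im = (i - 1 + N) % N, ip = (i + 1) % N;  /* periodic neighbors */
            double conv = u[i] * (u[ip] - u[im]) / (2.0 * dx);           /* u u_x   */
            double diff = nu * (u[ip] - 2.0 * u[i] + u[im]) / (dx * dx); /* nu u_xx */
            unew[i] = u[i] + dt * (diff - conv);   /* forward Euler update */
        }
        for (int i = 0; i < N; i++)
            u[i] = unew[i];
    }

    printf("u at x = L/4 after integration: %f\n", u[N / 4]);
    return 0;
}
```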

Code verification/validation

Turbulence simulations are often uniquely easy to verify and validate. A comparatively large body of experimental research has been built up that allows results to be gut-checked for qualitative correctness (validation), and research capabilities have grown to the point where certain simple simulation conditions can be reproduced in the laboratory (one such facility is Caltech's Explosion Dynamics Laboratory). An early example of DNS of fluid flow, executed on Stanford's parallel ILLIAC-IV, was regarded as successful precisely because it qualitatively reproduced the results of an earlier experiment. (A brief description is available here.)

Targeted architecture

Large-scale fluids simulations are generally intended to run on distributed-memory parallel machines, and are therefore coded using a message-passing paradigm. Cook and Riley's foundational paper on the subject highlights optimization of array layouts in memory as the key to squeezing good performance out of a distributed-memory machine, and derives scalability and load-balancing results for the third-order Adams-Bashforth integration scheme. Cook and Riley's simulations were executed on both a CM-5 and a Cray C90. They claim 30.4 Gflop/s performance on a 512-node CM-5, which is about half of the peak I was able to find listed for a similar computer in use by the NSA in the mid-1990s. Another relevant simulation was executed on a NERSC cluster in 1998 by Boersma, and in that report he specifically mentions coding the simulation with MPI, using "ghost points" to reduce unnecessary communication. The solver routines were from the FISHPAK package. While Boersma does not specifically say on which NERSC machine he ran his code, the CTR's website indicates that they most recently acquired access to the BlueGene facility at LLNL, which holds several of the top 10 spots in the Top 500. Bell and colleagues, working at LBNL, executed a remarkable simulation of what is referred to as a "V-flame" on 256 processors of an IBM SP RS/6000 at NERSC. The report on that work is available here. The CCSE released two reports in 2005 discussing detailed issues relating to large-scale simulations, notably the use of adaptive meshes, on the BlueGene/L. These reports and others are available here.
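As an illustration of the "ghost point" idea Boersma mentions, here is a minimal MPI sketch of my own, assuming a simple 1D block decomposition of the domain across ranks; it is not code from any of the simulations above. Each rank keeps one extra point on each side of its interior block and swaps boundary values with its neighbors before a stencil update, so the update itself touches only local memory.

```c
/* Minimal sketch of a ghost-point (halo) exchange, assuming a 1D block
 * decomposition across MPI ranks with periodic boundaries. Illustrative
 * only; not code from Boersma's or any other simulation cited here. */
#include <mpi.h>
#include <stdio.h>

#define NLOCAL 64   /* interior points owned by each rank (illustrative) */

int main(int argc, char **argv) {
    int rank, size;
    double u[NLOCAL + 2];   /* u[0] and u[NLOCAL+1] are the ghost points */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    for (int i = 1; i <= NLOCAL; i++)      /* some local initial data */
        u[i] = rank + i * 0.01;

    int left  = (rank - 1 + size) % size;  /* periodic neighbor ranks */
    int right = (rank + 1) % size;

    /* Exchange ghost points: send my first interior point to the left
     * neighbor while receiving my right ghost point, then the mirror image. */
    MPI_Sendrecv(&u[1],          1, MPI_DOUBLE, left,  0,
                 &u[NLOCAL + 1], 1, MPI_DOUBLE, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[NLOCAL],     1, MPI_DOUBLE, right, 1,
                 &u[0],          1, MPI_DOUBLE, left,  1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* A local stencil update (e.g. one substep of an Adams-Bashforth
     * scheme) can now use u[0]..u[NLOCAL+1] with no further communication. */
    printf("rank %d: ghost values = %f, %f\n", rank, u[0], u[NLOCAL + 1]);

    MPI_Finalize();
    return 0;
}
```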

The future of flow simulation

While one group of flow-simulation researchers is routinely exploiting parallelism to get high-fidelity results for large-scale flows, another group is focusing on algorithmic advances that allow reasonably detailed simulations to be executed effectively on small platforms. When resolving large-scale structures there is often no alternative to using huge parallel machines for a long time. The work of Prof. Pullin's group focuses on finer scales of turbulence, and in these cases the geometry of the flow can be exploited sufficiently that the algorithms can be run successfully on microcomputers. See, for instance, this paper, which summarizes a recent Ph.D. dissertation. So, as with any simulation endeavor, there are tradeoffs. It seems that for large-scale simulations the tradeoff favors parallel (super)computers, but for small-scale simulations it is more practical to exploit the underlying problem structure and use microcomputers, with parallelization not explicitly considered.