NAS Optimizations Attain 2x Speedup for US3D Hypersonics Modeling Code

05.27.09

To date, the NASA Advanced Supercomputing (NAS) Division's effort to optimize the US3D computational fluid dynamics (CFD) code has resulted in a 2.6x speedup in the solve routine, and a 2x speedup for the code overall, when executing a standard real-world test case of over 30 million grid elements on the Pleiades supercomputer.

US3D is a next-generation CFD code that uses an unstructured-grid approach. The code is under development by a team at the University of Minnesota, led by I. Nompelis, and supports the Aerodynamics, Aerothermodynamics and Plasmadynamics (AAP) discipline of the NASA Hypersonics Project. US3D significantly advances the ability to model complex geometries accurately and at high fidelity, with appreciably improved rates of convergence.

The NAS optimization effort has been underway for just a few months. Continuing work to reorder the code's extensive indirect addressing should yield at least another factor of two in overall performance within a similar timeframe, for a cumulative speedup of about 4x over the original code.

US3D

The US3D code supports both tetrahedral and prismatic cells. Because its grid is piecewise structured, the code can use line-implicit successive over-relaxation (SOR) techniques to significantly improve solution convergence rates. The matrix solve and the viscous flux calculations are the major time-consuming components of the code. As in many implicit CFD codes, the matrix solve dominates runtime.
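To make the idea of line-implicit SOR concrete, the following is a minimal sketch on a model problem, not US3D's solver. It assumes a 2D Poisson equation on a small structured grid with zero boundary values; the function names (thomas, line_sor) and the relaxation factor are hypothetical choices for illustration. Each grid line is solved implicitly as a tridiagonal system, and the result is then over-relaxed by a factor omega.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system (lower[0] and upper[-1] are unused)."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def line_sor(b, omega=1.2, tol=1e-10, max_sweeps=500):
    """Line SOR for the 5-point stencil 4u - (four neighbours) = b.

    b is an n-by-n list of lists (right-hand side already scaled by h^2);
    values outside the grid are taken as zero. Returns (u, sweeps_used).
    """
    n = len(b)
    u = [[0.0] * n for _ in range(n)]
    lower = [-1.0] * n
    diag = [4.0] * n
    upper = [-1.0] * n
    zeros = [0.0] * n
    for sweep in range(1, max_sweeps + 1):
        change = 0.0
        for j in range(n):
            # Neighbouring lines use the latest available values.
            below = u[j - 1] if j > 0 else zeros
            above = u[j + 1] if j < n - 1 else zeros
            rhs = [b[j][i] + below[i] + above[i] for i in range(n)]
            # Implicit solve along the whole line j.
            u_star = thomas(lower, diag, upper, rhs)
            # Over-relax toward the line solution.
            for i in range(n):
                new = u[j][i] + omega * (u_star[i] - u[j][i])
                change = max(change, abs(new - u[j][i]))
                u[j][i] = new
        if change < tol:
            return u, sweep
    return u, max_sweeps


if __name__ == "__main__":
    rhs = [[1.0] * 8 for _ in range(8)]
    solution, sweeps = line_sor(rhs)
    print("converged in", sweeps, "sweeps")
```

Solving a whole line at once couples all the unknowns along that line in a single step, which is why line-implicit relaxation typically converges in fewer sweeps than point-by-point relaxation on the same grid.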

Early NAS optimization work identified a number of operations in the solve that could be converted to Real*4. Using 32-bit floating-point data and operations (rather than the default 64-bit) halves the size of each value, which allows more effective use of cache, memory bandwidth, inter-node communication, and memory storage. The off-diagonal terms of the matrix computation were converted first, along with a number of temporary storage arrays. Other optimizations included loop reordering, loop fusion, and additional loop unrolling.
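The storage side of that trade-off can be illustrated with a short sketch. It uses Python's standard array module rather than Fortran, and the function name and sample data are hypothetical; it says nothing about which US3D arrays can tolerate reduced precision, only that 32-bit storage takes half the bytes at the cost of a rounding error of roughly one part in ten million per value.

```python
from array import array


def storage_comparison(values):
    """Store the same values as 64-bit and 32-bit floats.

    Returns (bytes_64, bytes_32, max_relative_error), where the error is
    the largest relative difference introduced by rounding to 32 bits.
    Assumes every value is nonzero.
    """
    a64 = array("d", values)  # 64-bit floats, like Fortran Real*8
    a32 = array("f", values)  # 32-bit floats, like Fortran Real*4
    bytes_64 = a64.itemsize * len(a64)
    bytes_32 = a32.itemsize * len(a32)
    max_rel_err = max(abs(x32 - x64) / abs(x64) for x32, x64 in zip(a32, a64))
    return bytes_64, bytes_32, max_rel_err


if __name__ == "__main__":
    sample = [1.0 / (k + 1) for k in range(1000)]
    b64, b32, err = storage_comparison(sample)
    print("64-bit bytes:", b64)
    print("32-bit bytes:", b32)
    print("max relative rounding error:", err)
```

Half the bytes per value means twice as many values fit in each cache line and each message between nodes, which is the effect the conversion to Real*4 exploits.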

The code was also restructured so that the compiler can generate instructions that take advantage of the floating-point acceleration provided by the "pipelining" features of the Xeon processors used in Pleiades.

For more information about this activity, please contact:

Jim Taft
james.r.taft@nasa.gov

http://www.nas.nasa.gov/publications/news/2009/05-27-09.html