2004 SCIENCE NEWS
02.17.04 NAS Scientific Consultant Improves CFD Code Performance
Using a combination of techniques on an SGI Origin 2000 supercomputer at the NASA Advanced Supercomputing (NAS) Facility, Sherry Chang, a scientific consultant (NASE), obtained a speedup of more than four times for the Real Gas Aerodynamic Simulator (RGAS) code, developed by NASA's Seokkwan Yoon. One of the functions of the Scientific Consulting group (NASE) is to help scientists port and optimize their codes on new computer platforms. The techniques applied in such a process depend heavily on the architecture of the machine, the structure of the code, the compiler, and the performance tuning tools.
The RGAS code is a computational fluid dynamics (CFD) code used to simulate supersonic combustion in scramjet (supersonic combustion ramjet) engines. Scramjet is an essential mode of operation for air-breathing rocket propulsion systems, which draw oxygen from the air rather than from an oxygen tank and thus offer clear advantages over conventional engines by making vehicles lighter and more efficient. In addition, with an easy modification of the code's chemistry model, RGAS could be used in the future to simulate and analyze re-entry vehicles such as the crew exploration vehicle (CEV).
RGAS is a 6,000-line serial code written in Fortran 77. After the old Cray vector machines on which the code used to run were removed, Yoon ported RGAS to the Origin machines. For a medium grid size, the code took about 40 hours for 200 iterations on the NAS O2K machine Steger. It was estimated that for larger grid sizes (approximately two orders of magnitude larger), one run would take approximately one year of wall time. Yoon therefore requested assistance from the Scientific Consulting group to improve the code's performance on the Origins so that studying larger cases would become feasible.
Analysis performed with the SGI tools perfex and ssrun indicated that RGAS suffered from significant TLB (Translation Lookaside Buffer) and cache misses. These performance bottlenecks are common when porting codes from vector machines to the Origins. By turning on a higher level of optimization during compilation, increasing the default page size with the SGI tool dplace, and hand-modifying the source code, the performance penalties caused by TLB and cache misses were significantly reduced. A simulation of 200 iterations for the test case with the medium grid size now takes less than nine hours of wall time, a speedup of more than four times over the 40 hours originally required.
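To illustrate why a larger page size can reduce TLB misses, the short Python sketch below counts how many distinct memory pages a strided access pattern touches at two page sizes, and then checks the reported speedup from the 40-hour and nine-hour figures. The access count, stride, and page sizes are hypothetical values chosen for illustration only; they are not taken from RGAS or from the actual dplace settings used, and the function names are invented for this example.

```python
def distinct_pages(n_accesses, stride_bytes, page_size_bytes):
    """Count the distinct pages touched by n_accesses strided accesses.

    Accesses start at address 0 and advance by stride_bytes each time.
    Each distinct page needs its own TLB entry, so fewer pages generally
    means less pressure on the TLB.
    """
    return len({(i * stride_bytes) // page_size_bytes for i in range(n_accesses)})


def speedup(before_hours, after_hours):
    """Ratio of the original wall time to the improved wall time."""
    return before_hours / after_hours


# Hypothetical access pattern: 1000 accesses with an 8 KB stride,
# such as stepping across one dimension of a large multidimensional array.
N_ACCESSES = 1000
STRIDE = 8 * 1024

small_pages = distinct_pages(N_ACCESSES, STRIDE, 16 * 1024)    # assumed 16 KB pages
large_pages = distinct_pages(N_ACCESSES, STRIDE, 1024 * 1024)  # assumed 1 MB pages

print("pages touched with 16 KB pages:", small_pages)
print("pages touched with 1 MB pages:", large_pages)

# Wall times from the article: about 40 hours before, less than 9 hours after,
# so 40 / 9 is a lower bound on the speedup.
print("speedup lower bound: %.2f" % speedup(40, 9))
```

With these assumed values the same accesses touch far fewer pages when the page size is larger, which is the effect that increasing the page size aims for. The actual benefit for a given code depends on its access patterns and on the TLB of the machine.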
The optimization obtained for the current model is also applicable to future CEV flow analysis.